ABSTRACT

The Secondary Level English Proficiency (SLEP) test, offered
by the SLEP School Services Program at the Educational Testing 
Service, measures English language listening comprehension and 
reading comprehension skills. It was developed for use with 
nonnative-English speaking students in grades 7 through 12. The SLEP 
is administered and scored locally, and the SLEP program does not 
receive routine feedback from local test users. This study was
undertaken to obtain formal feedback from a sample of SLEP users 
through a survey questionnaire. Questionnaires were mailed to over 
300 potential SLEP users. Although the return rate was relatively low 
(71 usable returns), the distribution of the returns by general
institutional type and location was similar to that of the total 
sample. Survey findings provide information regarding: (1) testing 
practices; (2) purposes of testing; (3) selected characteristics of 
the examinees; (4) test-users' perceptions of the principal strengths
and weaknesses of the SLEP and its manual; (5) the extent and nature 
of local studies concerned with validating the SLEP; and (6) related 
topics. Limitations of the findings for SLEP research are discussed. 
Four appendixes contain technical information about the survey and 
the questionnaire itself. Four exhibits, seven figures, and two 
tables illustrate the discussion. (Contains 18 references.) 
(Author/SLD) 
Abstract



The Secondary Level English Proficiency (SLEP) test, offered by
the SLEP School Services Program (SSP) at Educational Testing
Service (ETS), measures English language listening comprehension
and reading comprehension skills. It was developed
for use with nonnative-English speaking students in grades 7-12.
SLEP is administered and scored locally, and the SLEP
program does not receive routine feedback from local test 
users. The work described herein was undertaken to obtain 
formal feedback from a sample of SLEP users by means of a 
survey questionnaire. Questionnaires were mailed in April, 
1991, to over 300 potential SLEP-use contexts worldwide 
(addresses of individuals, institutions or agencies placing 
orders for the SLEP or related materials within the most 
recent 18-month period). Although the return rate was
relatively low (71 usable returns), the distribution of the 
returns by general institutional type and location was similar 
to that of the total sample. Survey findings provide informa- 
tion regarding testing practices, purposes of testing, 
selected characteristics of examinees (age/grade level, 
language background, and so on), test-users' perceptions of 
the principal strengths and limitations of the SLEP and/or the 
SLEP Test Manual (and suggestions for modification), the
extent and nature of local studies concerned with validating 
the SLEP, and so on. Implications of the findings for SLEP- 
related research and development activities are discussed. 



The Test of English as a Foreign Language (TOEFL) was developed in 1963 by the National Council
on the Testing of English as a Foreign Language, which was formed through the cooperative effort of
more than thirty organizations, public and private, that were concerned with testing the English
proficiency of nonnative speakers of the language applying for admission to institutions in the United
States. In 1965, Educational Testing Service (ETS) and the College Board assumed joint responsibility 
for the program, and in 1973, a cooperative arrangement for the operation of the program was entered 
into by ETS, the College Board, and the Graduate Record Examinations (GRE) Board. The 
membership of the College Board is composed of schools, colleges, school systems, and educational 
associations; GRE Board members are associated with graduate education. 



ETS administers the TOEFL program under the general direction of a Policy Council that was 
established by, and is affiliated with, the sponsoring organizations. Members of the Policy Council 
represent the College Board and the GRE Board and such institutions and agencies as graduate schools
of business, junior and community colleges, nonprofit educational exchange agencies, and agencies 
of the United States government. 

A continuing program of research related to the TOEFL test is carried out under the direction of the
TOEFL Research Committee. Its six members include representatives of the Policy Council, the 
TOEFL Committee of Examiners, and distinguished English as a second language specialists from the 
academic community. Currently the Committee meets twice yearly to review and approve proposals
for test-related research and to set guidelines for the entire scope of the TOEFL research program. 
Members of the Research Committee serve three-year terms at the invitation of the Policy Council; 
the chair of the committee serves on the Policy Council. 

Because the studies are specific to the test and the testing program, most of the actual research is 
conducted by ETS staff rather than by outside researchers. However, many projects require the 
cooperation of other institutions, particularly those with programs in the teaching of English as a 
foreign or second language. Representatives of such programs who are interested in participating in 
or conducting TOEFL-related research are invited to contact the TOEFL program office. All TOEFL 
research projects must undergo appropriate ETS review to ascertain that the confidentiality of data will 
be protected. 



Current (1991-92) members of the TOEFL Research Committee are: 



James Dean Brown University of Hawaii 

Patricia Dunkel (Chair) Pennsylvania State University 

William Grabe Northern Arizona University 

Kyle Perkins Southern Illinois University at Carbondale 

Elizabeth C. Traugott Stanford University 

John Upshur Concordia University 
Background



The Secondary Level English Proficiency (SLEP) test was 
developed by Educational Testing Service (ETS) to assess the 
English language listening comprehension and reading compre- 
hension skills of " . . . students entering grades seven 
through twelve whose native language is other than English" 
(ETS, 1988: 5).¹* More specifically, according to Stansfield
(1984: 2) "... the test is designed for use as a selection 
or admissions instrument by private secondary schools, or as 
a placement instrument by public secondary schools." SLEP 
School Services Program publications (e.g., ETS, 1987) note 
that 

as a norm-referenced test, [the SLEP] provides users 
with the opportunity to compare student results with 
those of other students in similar situations. . . . A
basic assumption underlying the SLEP test is that
language ability is a critical factor in determining the
degree to which secondary students can benefit from
instruction; to succeed they must be able to understand
what is being said (by both teachers and fellow
students) and to understand both formal and informal
material written in English (ETS, 1987: 5).

Users are informed (e.g., ETS, 1987: 5) that the SLEP® test 
can be helpful in making placement decisions such as, for 
example, 

assignment to ESL classes,
placement in a mainstream English-medium program,
exemption from a bilingual program,
exit from an ESL program,
ESL program evaluation.

Although the SLEP test was initially developed for use with 
secondary-level (G7-12) student populations, based on informa- 
tion supplied by the program, the test is being used to assess 
the ESL listening and reading proficiency of nonnative-English 
speaking students at other age/grade levels (e.g., 6th grade
students, college-level ESL students), academically unclassified
adults (e.g., enrollees in English-language institutes,
adult ESL classes), and so on.

Three statistically equivalent forms of the SLEP test are 
offered through the SLEP School Services Program: Form 1, 
developed in 1979-80; Form 2, developed in 1980-81; and Form 
3, developed in 1986-87. Each form is made up of 150 
multiple-choice questions of eight different types (see
Appendix A). Test booklets are reusable; examinees use
separate answer sheets to record their responses to the test
questions.²

* See numbered endnotes.

Various editions of the SLEP Test Manual (see, for example, 
ETS, 1988; 1991) provide, among other things, information 
regarding the psychometric characteristics of the test, 
evidence of validity (e.g., systematic differentiation of 
groups classified according to ESL proficiency levels, 
relatively high correlation with the TOEFL, and so on),
general guidelines for test use and interpretation, and 
suggestions for local research (e.g., it is recommended that 
users conduct studies designed to assess the extent to which 
SLEP scores are related to self-assessments, teachers' ratings
of ESL proficiency, or performance in regular academic
courses).



Need for Feedback from Test Users

Because the SLEP test is locally administered and scored, 
the SLEP School Services Program does not routinely receive 
information pertaining to test use, local studies, and other 
related matters. The program also does not receive the kind 
of examinee-generated feedback that is routinely available for 
a centrally administered test such as the TOEFL — for example, 
examinees' test scores, item-level responses, answers to 
background questions.

Without such information, the program is limited in its 
ability to judge the extent to which current forms of the SLEP
test are meeting the ESL assessment needs of practitioners in 
diverse settings, to introduce modifications that may be 
needed to improve the overall usefulness of the test, or to 
routinely summarize, evaluate, and publish data on test 
performance for various subgroups (e.g., age/grade level, 
language background).



Purpose of the Present Study 

The work described herein was undertaken to obtain formal 
feedback from users of the SLEP test by means of a survey
questionnaire concerned with matters such as those alluded to 
above. More specifically, the survey was designed to obtain 
information bearing on the following general lines of inquiry: 

• What are the basic patterns of test use (e.g., test 
forms used, number of examinees tested, number of times 
each examinee is tested, other assessment procedures 
used in conjunction with the SLEP)? 

• What are the characteristics of examinee populations
(e.g., age/grade levels, socio-political status [e.g.,
refugee, immigrant, international student], language
background)? If the test is being used with examinees
not classifiable within the G7-12 range (e.g., 6th
graders, college-level students, older adults), what are
the judgments of test users regarding the test's 
suitability or lack of suitability for such use? 

• To what extent is the SLEP being used for purposes
suggested in the SLEP Test Manual (e.g., assessment of 
the readiness to undertake English-medium academic 
instruction, placement for ESL/EFL instructional 
purposes, program evaluation, admission, monitoring the 
progress of individual students, and so on)?

• Are test users conducting local studies of the 
relationship between SLEP scores and direct measures of 
ESL/EFL students' ability to use English (e.g., teach- 
ers' ratings)? Are they developing local norms, as 
suggested by the program? What is the scaled-score
range that includes the average total score obtained by 
students when initially tested? 

• What are the principal strengths and limitations of 
the SLEP Test and/or the SLEP Test Manual, from the
perspective of local test users? What changes or 
modifications, if any, do users recommend? 

• Generally speaking, what characteristics of a 
standardized test of ESL/EFL proficiency (and related 
developer-produced materials and services) do test users 
believe would be most helpful/useful in use contexts 
similar to their own? 



Questionnaire Development 

The foregoing questions were judged to be generally 
applicable for test users regardless of location (that is, 
whether inside or outside the United States) and type of 
setting (e.g., school, college, language institute).³

A draft questionnaire that included both precoded and open- 
ended response options was developed, in consultation with 
program staff, and pretested.⁴ Based on results of pretesting,
it was decided that the one basic set of questionnaire
items would be appropriate for all test-use
contexts, with only minor changes in wording, primarily in
connection with certain testing procedures that are mandated
by statute in the U.S. and Canada, but not elsewhere.⁵
The two versions are shown in Appendix B, along with the
(undifferentiated) cover letter that accompanied each questionnaire
(cf. options for Q1 and Q6 in the respective versions
of the questionnaire).



Defining a Target "Population" for the Survey 

Orders for the SLEP test are received from diverse 
institutions and agencies, as well as from professionally 
qualified individuals, in the United States and elsewhere in 
the world. These include public schools, private academies or 
preparatory schools, international student exchange programs, 
language institutes, corporations, and postsecondary institutions
located in the U.S., Canada, and elsewhere.

The possibility of identifying, evaluating, and surveying 
the total population of institutions, agencies, or individuals 
ordering (using) the SLEP test during the past decade was 
considered. However, it was not feasible to undertake the
substantial effort that would be needed to retrieve and match
order-files across successive fiscal cycles during the
decade.⁶

After evaluating and ultimately rejecting this compre- 
hensive approach, it was decided to survey a sample definable 
on the basis of records included in the "current" systems 
file — typically covering transactions over approximately the 
most recent 18 months — for the period, roughly, between July 
1989 and December 1990.

Computerized printouts of addresses representing distinct 
transactions (that is, one or more orders for SLEP-related 
materials) during the period were used to identify the order- 
ing institutions/agencies/individuals; additional addresses 
were supplied by the TOEFL representative office in Canada.⁷
This process resulted in the identification of 356 different 
"potential" SLEP-use contexts (that is, different purchasers 
of the SLEP test and/or related materials).

Based on the basic identification provided in the fiscal 
files, these potential-use contexts were classified as being 
primarily, (a) academic (secondary vs. postsecondary), (b) 
language institutes, (c) international student exchange pro- 
grams, or (d) corporations or business institutions. The 
distribution of these potential-use settings by type and 
location is shown in Table 1. 

It can be seen that most of the orders (about 84 percent)
were shipped to academic settings, some 59 percent classified
as "secondary-level" and 25 percent classified as
"postsecondary-level." Approximately 71 percent of the orders were
shipped to addresses (including APO/FPO) in the U.S., 15
percent were sent to Canadian addresses, and the remaining 14
percent to all other addresses.

TABLE 1. Types of Institutions/Agencies Identified as Having
Placed Orders for the SLEP, By Location

                                    Location
Type*                   U.S.A.   Canada    Other    Total   Percent

Academic, Total           201       49       48      298    (83.7)
  Secondary               147       35       27      209    (58.7)
  Postsecondary            54       14       21       89    (25.0)
Exchange Program           11                         11    ( 3.1)
Language Institute          5        1                 6    ( 1.7)
Corporation                 6                          6    ( 1.7)
Other**                    30        5                35    ( 9.8)

Total (All types)         253       55       48      356
Percent                  71.1     15.4     13.5             (100.0)

* Classification inferred from information contained in the
addresses to which shipments of SLEP test booklets and related
materials were mailed.

** This category includes orders from institutions or agencies not
clearly classifiable according to the preceding categories (e.g.,
unfamiliar academic identification, embassies, correctional
institutions, individuals without institutional identification, and so on).

About 10 percent of the sample could not be classified with 
certainty according to one of the specific categories indi- 
cated (e.g., governmental agencies; individuals with profes- 
sional, but not institutional, identification; unfamiliar 
acronymic designations; and so on).
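
The percentages cited for Table 1 follow directly from the counts. As a rough illustration only, the short Python sketch below recomputes several of them from totals reported in the table (356 potential-use contexts in all). The variable and function names are illustrative assumptions and are not part of the original study.

```python
# Counts taken from the "Total" column and the "Total (All types)" row of Table 1.
TOTAL_CONTEXTS = 356

by_type = {
    "Academic, total": 298,
    "Secondary": 209,
    "Postsecondary": 89,
    "Exchange program": 11,
}
by_location = {"U.S.A.": 253, "Canada": 55, "Other": 48}


def percent_share(count, total=TOTAL_CONTEXTS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


type_shares = {name: percent_share(n) for name, n in by_type.items()}
location_shares = {name: percent_share(n) for name, n in by_location.items()}

for name, share in {**type_shares, **location_shares}.items():
    print(f"{name:<18}{share:5.1f}%")
```

Rounded to whole numbers, these shares correspond to the "about 84 percent," "59 percent," and "25 percent" figures for academic settings and to the 71/15/14 percent split by location.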



Survey Mailing and Response 

Survey questionnaires were mailed on April 19, 1991. No 
final reply date was specified. During the first four weeks, 
returns were limited in volume and scattered (that is, there 
was no clearly discernible peak). Both the timing of the
survey (coinciding with end-of-school-year pressures) and the 
lack of consistent personal identification (e.g., name, title, 
and program) for the person actually responsible for test use, 
militated against the prospect of substantially increasing the 
overall response through followup mailings, and none were made
(see Appendix C for procedures used in an effort to "personalize"
the basic mailing). However, returns were not formally
"closed" until September 30, 1991. 

As of that date, a total of 71 completed questionnaires had 
been received, distributed by type-of-use context and location
as indicated in Table 2; approximately 30 percent were 
received after June 30, 1991. The marginal distributions of 
returns by type- and location-of-use context, shown in Table 
2, were similar to the distributions that were obtained for 
the total survey sample (Table 1). It thus appears that the
responding sample is reasonably representative of the total 
sample with respect to both type and location of test-use 
contexts. 
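
The comparison described above can be sketched in a few lines of Python. This is only an informal check, assuming the location totals reported in Tables 1 and 2 (253, 55, and 48 contexts mailed; 51, 7, and 13 questionnaires returned). It is not a formal test of representativeness, and the names used are hypothetical.

```python
# Location totals as reported in Table 1 (mailed) and Table 2 (returned).
mailed = {"U.S.": 253, "Canada": 55, "Other": 48}
returned = {"U.S.": 51, "Canada": 7, "Other": 13}


def shares(counts):
    """Convert a dict of counts into a dict of percentages of the total."""
    total = sum(counts.values())
    return {key: 100.0 * value / total for key, value in counts.items()}


# Overall return rate: usable returns as a percentage of questionnaires mailed.
response_rate = 100.0 * sum(returned.values()) / sum(mailed.values())

mailed_pct = shares(mailed)
returned_pct = shares(returned)

# Percentage-point gap between the returned and mailed distributions.
gaps = {key: returned_pct[key] - mailed_pct[key] for key in mailed}
largest_gap = max(abs(gap) for gap in gaps.values())

print(f"Return rate: {response_rate:.1f}%")
for key in mailed:
    print(f"{key:<8} mailed {mailed_pct[key]:5.1f}%  "
          f"returned {returned_pct[key]:5.1f}%  gap {gaps[key]:+5.1f}")
```

Under these counts the overall return rate is about 20 percent, and the largest gap between the two distributions is for Canada, which is somewhat underrepresented among the returns.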

In three instances, two completed questionnaires were 
returned in the same envelope: one set from two ESL teachers
in different high schools in the same school district, one set 
from two admissions office staff members in a preparatory 
school, and a third set from two members of the ESL program 
staff at a university in the United States. Both 
questionnaires in each set were processed without special 
treatment.

In addition to the completed questionnaires, five questionnaires
were returned unopened (for insufficient address),
and five were returned not fully completed. Responses to 
precoded items were keyentered. Verbatim copies of write-in 
responses were prepared to facilitate evaluation of comments, 
suggestions, and recommendations from respondents. Moreover, 
respondents who provided information suggesting that 
systematic local studies of the concurrent or predictive 
validity of SLEP test scores had been conducted, were 
contacted (by letter, FAX, and/or telephone) in an effort to 
obtain additional detail. 



Findings 

Survey findings, summarized below, provide information 
regarding (a) the scope, volume, and frequency of testing with 
the SLEP, (b) characteristics of examinee populations in 
various use contexts, (c) the purposes for which the SLEP is 
being used, (d) the extent to which SLEP users are conduct- 
ing local validation studies and/or developing local norms, 
and (e) respondents' perceptions of the most positive aspects 
and the principal limitations of the SLEP and/or the Test Man- 
ual; their suggestions for change; and their characterizations 
of the hypothetical ESL proficiency test (and test-developer 
provided services) that would be most useful in contexts such 
as their own. 

Some implications of the findings are considered in the
final section.

TABLE 2. Distribution of Returns by Type of Use Context and Location

                                 Location
Type of use context      U.S.  Canada  Other   Total   Percent   (Mail)*

Academic                  41      6      11      58      81.7     (84%)
  Secondary               35      3       7      45      63.4     (59%)
  Postsecondary            6      3       4      13      18.3     (25%)
Language Institute**       3      1               4       5.6     ( 2%)
Corporation**              2                      2       2.8     ( 2%)
Exchange**                 5              2       7       9.9     ( 3%)

Total                     51      7      13      71

Percent                  72%    10%     18%    100%
(Mail)                   71%    15%     14%    100%

* Entries in this column do not total 100% due to the fact that
several returns were received from representatives of institutions/
agencies not classifiable as academic, exchange, language institute,
or corporation on the basis of information available when
questionnaires were mailed (that is, returns from "Other" in Table 1).

** Type of use context was reported by respondents (see cover
page of questionnaire), but academic level was inferred from other
information available.

SLEP Use: Basic Data

Figure 1 shows percentage distributions of responses to 
questions about (a) forms of the SLEP currently in use, (b) 
extent of reliance on SLEP only vs. SLEP in combination with 
other ESL assessment procedures, and (c) the number of individual
examinees tested and the typical number of times each examinee
was tested during the most recent 12-month period.

Test Forms in Use

Forms 1 and 2, but not Form 3, were reported being used by a 
majority of the respondents. Almost 90 percent reported using 
Form 1, 70 percent reported Form 2, and 37 percent reported 
Form 3. 
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BEST COPY AVAILABLE 



Figur* 1. S«l*ct*d procttcct in using th« SLEP t*tt, 
without rmQord to type of usa contaxt 



U«*t Form 1 only 
Uses Forms 1.2 
Extent of Reliance on SLEP

One-third of the respondents reported that they used only the 
SLEP for ESL assessment; about two-thirds reported that the 
SLEP test was used in conjunction either with local assessment 
procedures only (28 percent) or with local procedures, plus
one or more additional standardized ESL tests (37 percent).
The number of individuals who responded to each of the 
questions involved is shown as the base for percentages. 

Volume and Pattern of Testing

The number of examinees tested with the SLEP varied markedly
across use contexts, ranging from fewer than 10 annually to
1,000 or more.⁸ However, about 80 percent of those who
supplied pertinent information (only 55 of 71 did so) reported 
testing fewer than 250 individuals, and a majority tested 
fewer than 100. Some 60 percent reported that the typical 
individual was tested two times, and 40 percent reported only 
one-time testing. 

The Examinee Population



The SLEP test was originally developed for use with a 
population of examinees made up primarily of nonnative-
speaking international students who need to demonstrate their
ESL proficiency in connection with plans to enter an English- 
medium secondary school (G7-12) program in the United States 
or elsewhere. However, the test has been used not only with 
G7-12 students, but also with 6th graders and postsecondary- 
level students.⁹

The test-taking population also appears to include some 
nonnative-speakers who are not in "regular academic progres- 
sion" as international students planning to study in an 
English-medium environment — e.g., political or economic 
refugees, immigrants, and so on. Members of these groups may
differ from "regular students" with respect to age, educational
level, English-language background, and other
variables.

Accordingly, the survey contained questions designed to
assess 

(a) the extent to which SLEP is being used at various 
age/grade levels (percentage of examinees who are below 7th 
grade, in the G7-12 range and beyond the G7-12 range,
respectively);

(b) users' assessments of suitability/unsuitability for 
examinees below or above the G7-12 age/grade range; and 

(c) the socio-political status of the students involved 
(percentage of examinees in designated categories).



Age/Grade Levels of Examinees 

As may be seen in Figure 2, more than three-fourths of the 
respondents were testing at least some students in the G7-12 
range for which the SLEP was originally designed. However, 
about 35 percent were testing some postsecondary level
students, and about one-fourth were testing some students 
below the 7th grade level. 

• Testing was restricted to students in the G7-12 range
in only 52 percent of the settings that reported
this information; some 18 percent reported testing only
postsecondary level students.

Respondents indicating that the SLEP was being used with
examinees whose age/grade placement was either lower or higher
than the originally targeted G7-12 range were invited to
comment on the test's suitability for the younger or older
examinees involved, and to provide specific examples. A total
of 23 respondents commented.

Figure 2. Percentage of use-settings testing examinees at
designated educational levels

The comments, by and large, were general appraisals that 
did not specify particular test characteristics or provide 
specific examples. Only the "map" items (see Appendix A) were 
singled out by several of the respondents as being 
inappropriate or too difficult for 6th graders and other 
examinees.¹⁰

The general flavor of the comments is captured in the 
verbatim excerpts that follow (emphasis added in all
instances).



Comments on "out-of-level" use. An American International
School, South America (10 percent 6th graders). 

New students whose first language is not English are tested for placement
(ESL I [beginners]/ESL II [intermediate-advanced] or regular class). Students
placed in ESL are retested in the middle and at the end of the year to assess
progress and decide when they should be mainstreamed into the regular
classes. We used it for 6th graders as well. It appears to be suitable.

Oregon, Public Middle School (20 percent 6th graders).

We administer the SLEP test to ESL students once a year. From the results
I can decide whether or not the student should have to take the standardized
achievement tests and also what support services are needed in academic areas.
The sixth graders had no problems. They were on-task and tried to do their
best. The older students (adult ESL) had never filled in an answer sheet of any
kind, so I gave extra instruction and prompting as needed. All the students fail
miserably on the map section.

Virginia Public School (occasional 6th grader).

Content better suited to 7-12 graders than to sixth graders. Map test is 
definitely geared to older students. However, a great test.

Oregon, Educational Service District (some 6th graders). 

We teach ESL in migrant/bilingual resource rooms in two counties in rural
eastern Oregon. The SLEP is used each spring to evaluate growth in our
secondary students (G7-12). ESL is an elective class where students are
enrolled in one or two periods per day. 7th and 8th grades do well; 6th graders
do not do well; listening part of test is good.

Michigan, Middle School (G6-8).

Test is administered in the fall (form 1) and spring (form 2) to evaluate 
growth and need to be in ESL class. Listening comprehension: Map and cars 
unsuitable. This section seems to be particularly confusing to the students 
because they are not familiar with the concept of driving. 

Louisiana, University-Based Intensive ESL Program. 

SLEP used to place students in our month-long intensive program. Students 
are tested when they first arrive in the program and are placed solely on their 
SLEP score. It is not used for post-testing or advancement. Most of our 
students are 18-26 and some (approximately 20 percent) 26 and older. Age 
does not appear to be a factor, except with much older students who appear to 
be intimidated by standardized tests.

Japan, American Liberal Arts College.

We are an American liberal arts college operating in Japan. We use the
SLEP as one factor in determining admissions, placement, and promotion. 
Most of our students are between the ages of 18-21. These students have 
completed high school, and many have attended special schools for one 
additional year trying to get into university. SLEP is helpful in determining 
general language abilities. 

Japan, U.S. University Branch. 

Virtually all of the students tested with the SLEP exam are in the 18-20 year 
age range. The exam is fairly well-suited to our student applicants, but is 
perhaps a little more suitable for a slightly younger age group.

U.S. University. 

There are problems with content . . . : the car map is confusing in test form 1.

Japan, U.S. University Branch. 

We use the SLEP as part of the admissions assessment for Japanese students 
entering our IEP. The main purpose is for placement of students in roughly
equivalent groups. The SLEP gives a good general assessment of student 
achievement levels.



On balance, the comments suggest that the SLEP is perceived 
to be "suitable" for use with examinees at quite diverse 
educational levels from the 6th grade through, at least, the 
early postsecondary years. It is also perceived as being, in 
some ways, possibly less suitable for younger students (below 
the 7th grade level) than for postsecondary level students. 



Socio-Political Status of Examinees 

Figure 3 shows distributions of means of reported percent- 
ages of examinees in designated sociopolitical categories, as 
reported by survey respondents in the U.S., Canada, and other 
countries. The several distributions shown in the figure are 
generally similar to the comparable distribution reported by 
Stansfield (1984) for the basic SLEP reference group — a sample 
of ESL students in U.S. secondary schools. 

Figure 3. Reported status of SLEP examinees in the
current survey: U.S./Canada (upper panel) vs.
other countries (lower panel)

In the reference group sample, approximately one-fourth of
the examinees involved were self-classified as foreign
students; slightly lower percentages were classified as 
immigrants, refugees, and U.S. citizens, respectively. 

In the current survey sample, respondents from the U.S.A./ 
Canada reported an average of slightly more than 40 percent
international students, compared to an average of 50 percent 
in this category reported by respondents in other locations. 
Examinee populations that included refugees or undocumented 
individuals were largely restricted to settings in the United 
States and Canada. 

Some indication of the types of demographic diversity 
represented in SLEP-use settings is provided by the following 
descriptions.

Canadian College.

The age range is 18 to 70, though most students are in their 20's; . . . classes
for beginners with virtually no English, right up to college prep; some are 
refugees with very little formal schooling; others have the equivalent of high 
school in their mother tongue, many have university backgrounds in their 
mother tongue, but function in English at very low level.

California School District. 

About 1,000 (G7-12) students tested last year; 70 percent Spanish, 50
percent undocumented, 30 percent refugees, 20 percent recent immigrants. 



Language Backgrounds of SLEP Examinees 

Most of the respondents (67 of 71) answered a question 
regarding the language backgrounds of examinees tested with 
the SLEP. Eleven (11) language groups were designated on the 
questionnaire; respondents were asked to write in names of 
other pertinent groups. Figure 4 shows the percentage of 
respondents reporting each of the 11 designated groups and the 
distribution of use contexts according to the total number of 
different language groups reported (designated plus write-in).

The data in Figure 4 simply point up the language groups 
that are most consistently represented across SLEP-use set- 
tings, and indicate that the examinee populations served by 
SLEP-use contexts differ considerably with respect to degree 
of linguistic heterogeneity. In about one-third of the SLEP-
use settings, for example, only one language group is being
tested, whereas at the other extreme, one respondent reported 
more than 50 "nationality groups." 



Purposes of Testing 

As may be seen in Figure 5, test users in the United 
States and Canada (solid bars in the figure) and their 
counterparts elsewhere in the world (hatched bars) reported
a generally similar pattern of testing purposes. Only a few 
respondents (about 11 percent) reported testing for only one 
of the purposes designated in the questionnaire.

• Assessing the readiness of ESL students for English- 
medium academic instruction was the most frequently 
reported purpose for testing. This purpose was cited by 
more than two-thirds of all respondents (U.S.A./Canada,
64 percent; other countries, 77 percent).

Figure 4. Principal language groups represented in SLEP
use settings, and differences across settings in the
linguistic diversity of SLEP examinees

Figure 5. Purposes for which the SLEP test is being used,
by location: U.S.A./Canada (solid bar) vs. other countries

• Use of the SLEP in assessing average (net) gain in
proficiency after instruction in EFL, ESL, Bilingual,
and similar programs, using either locally devised
models or mandated evaluation models (U.S.A./Canada), was
reported by a slight majority of respondents overall.

Data not shown in the figure indicate that about 33 percent
of the U.S.A./Canada respondents in secondary-school settings
reported using the SLEP to assess "... average gain (using
mandated evaluation models) in programs defined/required by
statute (Federal/State/Provincial)." Gain assessment, without
regard to model, was cited by about 75 percent of
U.S.A./Canada respondents in secondary school settings.

• Placement of students for purposes of ESL or EFL 
instruction was reported to be a testing objective by 
more than 40 percent of the respondents. 

• Screening for admission to an institution or program 
was cited as a purpose for testing by about one-third of 
all respondents. Data not shown in the figure indicate 
that this purpose was cited by all the exchange-program 
respondents.

• Monitoring the progress of individual students was 
cited by some 57 percent of U.S.A./Canada respondents
and about 31 percent of other respondents. 

All of the foregoing, of course, are well-established 
objectives of ESL proficiency testing; several illustrative 
descriptions provide a more detailed perspective.



Illustrative Elaborations of Reported Uses 

U.S. Private Secondary School. 

International students are given the SLEP test and another assessment tool 
to determine whether they are proficient enough in English to be placed in the 
regular academic curriculum or the ESL program. Based on these test scores, 
students are then divided into four levels of proficiency and are placed in
courses according to these levels in the ESL program. Students are tested 
again in December and in May to determine their progress in English. 

U.S. Public Secondary School for International Students. 

Used for entrance screening to assess levels of English proficiency to 
determine whether we will accept them into the academic program. Also for 
measuring progress on a yearly basis, and deciding which ESL classes they
should take. 

Intensive ESL Program (U.S. University).

We use the SLEP to place students in our month-long intensive program. 
Students are tested when they first arrive in the program and are placed solely 
on their SLEP score. It is not used for post-testing or advancement. Some 
65 percent of students are short-term language students planning to return to 
their country upon completion of studies at the language institute. 

Exchange Program (U.S.). 

(We) use the SLEP test to assess the level of English proficiency of our high 
school aged foreign exchange students. All foreign students are given the 
SLEP test prior to their acceptance to participate in an academic year in the 
United States. Our overall objective is to assess each student's English ability 
and their ability to function in an English-speaking high school. We use the
SLEP test to screen our students for acceptance to the program (who must 
achieve a minimum score for acceptance). 

U.S. Public School District. 

We teach ESL in migrant/bilingual resource rooms in two counties in rural 
eastern Oregon. The SLEP is used each spring to evaluate growth in our 
secondary students (grades 7-12). ESL is an elective class where students are
enrolled for one or two periods per day. The class is graded and carries high 
school credit (grades 9-12). As required by our Migrant and bilingual Federal 
Program, we test every spring all secondary students being served in an ESL 
component. Teachers do informal assessments for their own diagnostic 
purposes. We've reported the scores to our program evaluator. The teachers 
use the results for their information informally only. 

U.S. College (Japan). 

We use the SLEP results to help decide on placement level for applicants 
wishing to enter our intensive English Language Program, . . . . In addition to
the SLEP, we also administer a 25-minute English essay exam and conduct
a 10-minute oral interview (with a trained ESL professional) for each 
applicant. The SLEP results comprise one-half of the overall result, while the 
essay score and interview score each comprise one-fourth. 

U.S. Independent Secondary School.

We use the SLEP for all our new students from foreign-speaking
backgrounds for placement in three different English classes. Then we give 
them the test again at the end of the year to assess the gains they have made. 
We also use it for admissions guidance if the applicant has not taken the 
TOEFL. The SLEP is studied after administration and items are studied to 
ascertain weaknesses that can then be worked on in class. 

U.S. Middle School. 

I administer the SLEP test to ESL students once a year. From the results
I can decide whether or not the student should have to take the standardized 
achievement tests and also what support services are needed in academic 
areas. SLEP is a standardized report card to show other school personnel that 
regular testing is appropriate or inappropriate with individual students. 

U.S. School District. 

The . . . Unified School District uses only the reading portion of the SLEP 
test for students enrolled in the Secondary Level ESL Program at middle 
schools and high schools. The assessment is given to determine entry and exit 
level reading skills. The reading portion of the SLEP test is used as 1) an 
initial or entry assessment to determine a student's reading proficiency and 2)
an end-of-the-year assessment to determine the student's progress.

U.S. Public School. 

SLEP is used to assess listening and reading comprehension of ESOL 
students in 7-12, fall and spring of each year. The scores are used (along with 
other testing data) to determine placement and exit of ESOL students in the 
ESOL program. (Especially reading suitable for LEPs). 



Local SLEP Validation and Normative Studies 

Through the SLEP Test Manual (e.g., ETS, 1991, 1988, 1987),
the School Services Program advises test users (a) to conduct 
local studies designed to assess the relationship of SLEP 
scores to teachers' observations of proficiency and other 
pertinent performance criteria, and (b) to develop local
norms.

SLEP users were asked to indicate whether they had 
conducted studies along lines indicated above and developed 
local norms, respectively. They were invited to provide brief 
descriptions of any studies that may have been conducted, or, 
in the absence of formal studies, to indicate their impressions
of the relationship between SLEP scores and direct
measures of proficiency. They were also asked to supply norms 
tables, if available. 

The specific questions posed for consideration by respondents
are shown in Exhibit A. Selected SLEP reference-group
"placement" information has been inserted opposite the respective
SLEP total score ranges (Q5d in Exhibit A). This information
was not included in the survey questionnaire itself.

It was assumed that most respondents — including those who 
may not have conducted formal studies or developed local 
norms — would be able to indicate the score-range that included 
the average total scaled-score for examinees taking the test 
locally. It was hoped that some users would be in a position 
to forward reports of local studies that would provide de- 
scriptive statistics for defined subgroups (age/grade level, 
years of ESL/EFL study, and so on) and other validity-related 
evidence. However, they were not directly invited to do so. 

Local Studies and Local Norms 

It can be seen in Figure 6 that, of 71 test users responding 
to the survey, about 60 percent reported conducting local 
studies of the type described in Q5c, but only 14 percent 
indicated that local norms had been developed. 

Taken at face value, these figures suggest widespread lack 
of attention to the development of local norms, as defined in 
the questionnaire, namely, as "a table showing the percentage 
of students scoring at or below designated SLEP scores." Only 
14 percent of the respondents reported having developed such 
tables. None of the respondents supplied a norms table 
meeting the definition involved, although specifically invited 
to do so. By inference, locally developed tables of this
type are not "essential" for local testing purposes in 
contexts such as those represented in the survey sample. 
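
For readers unfamiliar with the questionnaire's definition of local norms, the following is a minimal sketch of how such a table (the percentage of students scoring at or below designated SLEP scores) might be built. The function name, the example scores, and the choice of designated scores are illustrative assumptions, not material drawn from the survey or the SLEP Test Manual; scores are assumed to be whole numbers, so that "< 33" corresponds to an upper limit of 32.

```python
def local_norms_table(scores, cut_scores):
    """Return (designated score, percent scoring at or below it) pairs."""
    if not scores:
        raise ValueError("at least one score is required")
    n = len(scores)
    table = []
    for cut in sorted(cut_scores):
        at_or_below = sum(1 for s in scores if s <= cut)
        table.append((cut, round(100.0 * at_or_below / n, 1)))
    return table


# Hypothetical total scaled scores for ten locally tested students.
example_scores = [31, 35, 38, 40, 42, 44, 47, 50, 53, 56]
# Upper limits of the first four Q5d score ranges (assumes integer scores).
example_cuts = [32, 39, 46, 53]

for cut, pct in local_norms_table(example_scores, example_cuts):
    print(f"{cut:>3}  {pct:5.1f}%")
```

A table of this kind describes only the local examinee group at hand; it is not a substitute for the reference-group percentile ranks cited in Exhibit A.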

Notwithstanding apparent lack of attention to "local norms 
development," a substantial majority of respondents provided 
information regarding the average SLEP performance of their 
students at the time of initial testing (Question Q5d), as can
be seen in Figure 7. Even so, 22 percent either did not
respond at all or indicated two or more score-categories
(included in the NR category).

It is apparent from the distribution of reported averages
in Figure 7 that the SLEP is being used with local examinee-
populations that differ markedly, on the average, with respect
to level of developed ESL proficiency. 

Exhibit A



Questions Regarding Local SLEP Studies and 
Norms Development 

Q5c. Have you been able to study the relationship 
between SLEP scores and direct measures of ESL/EFL 
proficiency (for example, ESL/EFL instructor's ratings 
of oral English proficiency; academic teachers' ratings 
of students' ESL/EFL skills)? 

1. Yes (please describe briefly) 

2. No (please comment briefly on your impressions 

regarding the foregoing, and reasons for them) . 

Q5d. Five SLEP total scaled-score categories are 
specified below. Please check the score-range that 
includes the average score typically obtained by 
students when initially tested. 



                  (SLEP reference group placement)*

1. < 33   = P24   (Bilingual, Full-time, Mean = 32)

2. 33-39  = P39   (Bilingual P-T or ESL F-T, Mean = 37)

3. 40-46  = P57   (ESL Part-time, Mean = 43)

4. 47-53  = P75   (Mainstream Class, Mean = 50)

5. 54 +   = > P75 (No subgroup at this level)



Q5e. Have you developed local norms for the SLEP (e.g., 
a table showing the percentage of students scoring at 
or below designated SLEP scores)? 

1. Yes (if possible, please enclose a copy of your 

norms table and related description) 

2. No



* The "SLEP reference group placement" data (percentile 
ranks for upper-limit of score intervals, and total 
score means for placement levels) included with Q5d 
above, reflect findings of the initial SLEP validation 
study (Stansfield, 1984; also reported in various 
editions of the SLEP Test Manual [e.g., ETS, 1987]). 
These data were not included as part of the basic 
question posed for survey respondents. 

Figure 6. Have you conducted local studies? Developed
local norms?: Responses to Q5c and Q5e

Figure 7. SLEP total scaled-score range of typical student
at initial testing (Q5d)

Some sense of the functional implications of differences
in average SLEP performance is conveyed by the proficiency- 
placement levels designated in Exhibit A, above, with the 
corresponding mean total SLEP scaled scores for students in 
the respective levels (from Stansfield, 1984; also reported in 
various editions of the SLEP Test Manual [for example, ETS 
1987, 1988]).
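
As an illustration of how the Q5d score ranges relate to the reference-group information in Exhibit A, the sketch below maps a total scaled score to its Q5d category, the percentile rank of the range's upper limit, and the associated placement level. The function and variable names are hypothetical, and whole-number scores are assumed.

```python
# Upper limit, Q5d category, reference-group percentile rank of the upper
# limit, and placement level (with mean total score), as listed in Exhibit A.
Q5D_RANGES = [
    (32, 1, "P24", "Bilingual, full-time (mean 32)"),
    (39, 2, "P39", "Bilingual part-time or ESL full-time (mean 37)"),
    (46, 3, "P57", "ESL part-time (mean 43)"),
    (53, 4, "P75", "Mainstream class (mean 50)"),
]
TOP_CATEGORY = (5, "> P75", "No subgroup at this level")


def q5d_category(total_scaled_score):
    """Return (category, percentile label, placement) for a total score."""
    for upper, number, percentile, placement in Q5D_RANGES:
        if total_scaled_score <= upper:
            return number, percentile, placement
    return TOP_CATEGORY


print(q5d_category(50))
```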

Evaluation of Respondents' Comments 

A majority of the respondents accepted the invitation to 
"please describe briefly" any local studies that may have been 
conducted or to indicate their impressions of the relationship 
between SLEP scores and direct measures of proficiency as 
outlined in Question Q5c.

Both the descriptions of local studies and study findings, 
and informal observations regarding SLEP's concurrent or 
predictive validity, varied markedly in style as well as 
substance — by inference, reflecting similarly marked 
differences in the nature, scope, and degree of psychometric 
and statistical rigor of the local studies involved. 

Almost one-half (33 of 71, or 46 percent) of survey 
respondents (24 who did not report a study and 9 who did so) 
neither commented on SLEP's concurrent validity nor provided 
information bearing directly or indirectly on SLEP's validity 
for local purposes.

The remaining survey respondents (38 of 71) provided 
comments, not all of which were deemed to be directly responsive
to the question posed. Most of the responsive
comments involved direct or indirect allusions to SLEP's rela- 
tionship to other measures, or SLEP's usefulness or lack of 
usefulness for local purposes — e.g., placement, including 
references to score levels at which students are judged to be 
ready to enter full-time English-medium instruction. 

Concurrent validity. The most comprehensive program of
local validation research described by a respondent to the
survey involved the systematic assessment of concurrent rela-
tionships between SLEP scores and direct assessments of oral
English proficiency and writing skills, respectively, in 
samples of Japanese students in the intensive English Language 
Program of the Japanese branch of a U.S. university. 

Testing Director, U.S. University (Japan). 

We use the SLEP results to help decide on placement level for (such 
students). In addition to the SLEP, we also administer a 25-minute English 
essay exam and conduct a 10-minute oral interview (with a trained ESL 
professional) for each applicant. . . . Our exam is locally developed and
holistically scored by two readers on a 1-6 scale; our interview 'test' is also 
locally developed—students are rated on a scale of 1-7 in six areas of 
communicative behavior. 

Over a period of two academic years, from November '89 to April '91, the 
three forms of the SLEP were administered a total of 12 times as part of our 
IELP placement tests. I have observed a Pearson correlation coefficient
ranging from .57 to .72 and averaging .63 (N = 1,648), with writing exam
scores . . . , and a correlation coefficient range of .55 to .69, averaging .63, with
interview scores (in samples with initial total scaled score means in the 40-46 
range). 
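
The coefficients cited by this respondent are Pearson product-moment correlations. As a minimal sketch of how a local user might compute such a coefficient between SLEP scores and a direct measure (for example, essay ratings), the following may be useful. The function name and the paired values are hypothetical; they are not data from the survey.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two cases")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations: SLEP total scores and essay ratings (1-6).
slep_scores = [38, 41, 43, 45, 47, 50]
essay_ratings = [2, 3, 3, 4, 3, 5]
print(round(pearson_r(slep_scores, essay_ratings), 2))
```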

The studies just described reflect an unusually thorough 
and comprehensive application of indirect and direct measures 
in placing students — use of a composite score derived from the 
SLEP, the oral proficiency rating, and the essay rating.
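
The respondent's program weights the SLEP at one-half of the overall result and the essay and interview at one-fourth each. How the three measures are placed on a common scale before weighting was not reported, so the sketch below simply assumes that each score is rescaled linearly to a 0-1 range. The function names, the rescaling step, and the SLEP score limits passed in the example are assumptions made for illustration only.

```python
def rescale(value, low, high):
    """Map a score linearly onto 0-1 (an assumed step, not reported)."""
    if high <= low:
        raise ValueError("high must exceed low")
    if not low <= value <= high:
        raise ValueError("value lies outside the stated range")
    return (value - low) / (high - low)


def composite_result(slep, essay, interview, slep_range,
                     essay_range=(1, 6), interview_range=(1, 7)):
    """Weighted composite: SLEP one-half, essay and interview one-fourth each."""
    return (0.50 * rescale(slep, *slep_range)
            + 0.25 * rescale(essay, *essay_range)
            + 0.25 * rescale(interview, *interview_range))


# Hypothetical applicant; the SLEP limits (20, 67) are an assumed scale range.
print(round(composite_result(45, 4, 5, slep_range=(20, 67)), 3))
```

The essay and interview ranges default to the 1-6 and 1-7 rating scales mentioned by the respondent; a local program would substitute its own scales and combination rule.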

The information supplied by other respondents who com- 
mented on SLEP's validity and usefulness was not buttressed by 
citation of empirical findings comparable to the foregoing. 
At the same time, there was a relatively consistent "positive 
validity" theme in the comments — that is, relatively con- 
sistent reports, based on formal and informal observation, of 
positive relationships between SLEP scores and more direct 
measures such as those referred to in the question, and/or 
statements indicating that the SLEP had been found to be 
"valid" or "useful" for local purposes. 

Verbatim excerpts from all the comments that were deemed 
responsive to the request for information about studies of 
SLEP's validity (see note 12, above, and related discussion), 
by respondents who reported that a study had been conducted, 
reflect the general themes outlined above. A few individuals 
offered comments bearing on SLEP's validity or usefulness, 
based on informal observation only. These comments are 
identified accordingly in the summary statements that follow. 



Comments on SLEP's validity and usefulness.

Oregon Public School.

Teachers report that rankings of students by the SLEP generally reflect their 
own assessments. The correlation between reading scores on the SLEP and 
district graduation standards is .58. 

English Language Institute.

We asked for teacher rankings (previous to SLEP testing) and compared
with SLEP rankings. There was high coincidence, typically over 85 percent. 

Wisconsin Public School. 

Informally we use three factors to determine a student's placement (LEP 
level). I can say that there is generally a high relationship between SLEP 
scores and performance. 

Preparatory School (Japan).

After having tested 200 students and worked with them for at least one 
academic year, I have noted a clear correlation between SLEP test scores and 
academic grades, later TOEFL scores, and oral English proficiency. 

Other respondents (N = 16) who commented on SLEP's concur- 
rent relationships with other measures, used language much 
like that cited above. For example: 

(There is) direct correlation between SLEP scores and other ESL tests. 

Students who score consistently higher on the SLEP are those who have 
relatively higher academic ratings and demonstrate a higher degree of oral 
proficiency. 

(There is) fairly good correlation between SLEP scores and proficiency in written 
English, but not necessarily spoken English. 

SLEP is a very accurate measurement. 

SLEP reading scores reflect instructors' ratings of students' English reading 
skills. 

There appears to be a loose correlation between SLEP results and academic 
teachers' ratings of ESL/EFL skills.

Relationships (with types of measures designated in the questionnaire) 
studied only informally, but the correlation is positive and seems high. 

Based on informal observation, reading scores reflect class performance, but 
I wonder about whether high scores indicate readiness for 'academic reading'
[from a U.S. university respondent who reported no
formal study].



Teachers feel scores are good indicators of progress [from a U.S. high
school ESL teacher who reported no formal study].

Several respondents commented on SLEP's role in placement.

International School (Switzerland).

SLEP so far shows to be highly accurate in enabling placement, with the 
proviso of placement affecting performance. 

Intensive ESL Program (U.S.). 

We have not made any formal study, but have found that we cannot rely 
solely on the SLEP for accurate placement. We have probably 10-15 percent 
of SLEP testers who are moved up or down following teachers' 
recommendations which disagree with SLEP results. 

Independent Preparatory School (U.S.). 

We have been able to make cut-off scores on the SLEP that are accurate 
as far as those students' ability to achieve in the class we assign them to. 

Establishing readiness for English-medium instruction.

Several respondents focused their comments on SLEP score 
levels at which students are judged to be ready to enter full- 
time English-medium academic programs, or indicated placement- 
levels (e.g., beginner, intermediate, advanced — not behavior- 
ally defined in any instance) associated with specified SLEP 
score levels. 

In the original SLEP validation sample, for ESL students 
who reported that they were in "mainstream classes" (full-time 
English-medium, academic instruction), the average SLEP total
scaled score was 50 (see Exhibit A, above). It is noteworthy
that several respondents who mentioned this factor independ-
ently identified SLEP scores at about this level as being
indicative of readiness to enter English-medium academic 
programs. More specifically: 

American Liberal Arts College (Japan).

It appears from our experience that students who score 48-50 have the
ability to communicate in English in a way that would allow them to do
academic work for credit. 51-60 usually means that their writing skills also are
of a high enough level to engage in academic writing. 

Canadian College.

ESL 3, 42-54; Mainstream 55 plus. 

International School (Singapore).

Beginner (20-34), Low Intermediate (35-39), High Intermediate (40-47), 
Advanced (48-54). 

Public High School (U.S.). 

Students who generally get 48 and over are generally doing well and 
functioning in regular classes. They have a high transfer of skills from the 
primary language. 

Independent School (U.S.). 

We use a scaled score of at least 50 for placement into a regular English 
class. 

Student Exchange Program A (U.S.). 

We know that below 50 is a risk in one of the . . . member schools, and 
the score must be balanced by high results on other factors.

Canadian Continuing Education Program. 

SLEP is used to make the general distinction between ESL and high school. 
We use 55 (raw score) on Listening and 50 (raw score) on Reading 
Comprehension as an average benchmark to admit students to high school. 
(The Form 1 total scaled score equivalent is 48). 

Student Exchange Program B (U.S.). 

Minimum scaled score for acceptance next year is 50. 

Other respondents simply indicated that having relatively 
high SLEP scores was important to successful performance in 
English-medium programs — they did not cite clearly interpret- 
able score levels. For example: 

College-Related Preparatory School (Canada).

It is clear that a student in the 85-90 percentile is ready for the regular
English program at our school and can integrate confidently into it.
(Reports initial means in the 47-53 range, SLEP
reference group: 60th to 75th percentiles).

Exchange Program C (U.S.).

We have found that those students who far surpass the minimum score (not 
indicated) set for acceptance in the program usually have little difficulty 
functioning in an American high school. (Typical student is in the 
40-46 range when initially tested).

The comments of respondents, on balance, warrant the 
following general conclusions: 

• SLEP scores were positively related to other measures
of proficiency, including direct measures of oral
English proficiency, essay ratings, teachers' ratings,
and so on.

• SLEP scores provide generally acceptable (useful,
accurate) bases for placing students according to
proficiency levels, with the usual provision for
adjustment in placement, based on actual performance in
classes at the initial placement level.

• SLEP scores also have proven to be useful for
screening prospective participants in exchange programs
involving selection of students aspiring to study in
English-medium preparatory schools.



Users' Perceptions of SLEP's Strengths and Limitations 

Test users were invited to indicate what they perceived to 
be positive and/or negative aspects of the SLEP and the SLEP 
Test Manual and to make suggestions for improvement, using 
Questions Q8 through Q10, as indicated below:

Q8. What do you regard as the most positive features 
of the SLEP Test (considering the uses indicated above)? 

Q9. And what are its primary limitations, from the 
same perspective? 

Q10. What do you regard as the most positive/negative
features of the SLEP Test Manual? What changes and/or 
additions to the Manual or the test itself would be most 
helpful to you? 

Generally, comments regarding SLEP's most positive features 
emphasized various aspects of administrative convenience (ease 
of administration, availability of self-scoring answer sheets, 
and so on). Also emphasized were the fact that the SLEP is a
standardized test of ESL proficiency that can be administered 
and scored locally, and that SLEP is valid (useful, helpful, 
accurate) for intended uses, and provides measures of both 
listening comprehension (sometimes referred to as "oral") 
skills and reading skills. Some respondents singled out the 
listening comprehension section for positive comment, while 
others (fewer in number) were especially impressed by the 
reading comprehension section. 

Comments regarding perceived limitations of the SLEP, on 
the other hand, are less amenable to general summarization 
than those regarding positive features of the test. Whereas 
most of the positive comments pertained to identifiable 
aspects of the SLEP test itself (e.g., ease of scoring; 
validity or usefulness for local purposes, based on experience),
many of the limitations mentioned were not "SLEP specific."
Rather, they appear to be generalizable to any ESL
proficiency test that only provides measures of listening and 
reading skills or to ESL proficiency assessment generally. 

The comments of one ESL teacher in a public high school 
situation serve not only to highlight the recurring theme in 
these comments but also to suggest a logical, albeit difficult 
way to overcome the limitations involved: 



ESL Teacher, U.S. High School. 

(There) is no testing of oral language (writing skills). I would like to see a 
SLEP test that would include all four skills: listening, speaking, reading, and 
writing; easy to administer whether to one student or many at the same time.



Variations on this theme are discernible in several brief 
excerpts from more extended comments. 

SLEP does not measure student's ability to use the language directly; . . . has
no writing/grammar component--must be supplemented; more (needed) on
grammar and usage; . . . (we) need a test dealing with a broader range of skills
for accurate assessment of progress; the SLEP does not measure proficiency in
producing language (e.g., writing, speaking); . . . (it) only includes reading and
listening comprehension; . . . (we) need a writing documentation since SLEP
tests only LC and reading; (only a) limited range of knowledge (is) tested for;
. . . (SLEP) does not measure speaking ability; . . . doesn't test written discourse;
(the SLEP) does not assess production.

A smaller number of respondents indicated that high 
performance on the SLEP does not necessarily indicate a 
comparably high level of functional ability to deal with 
"academic English" in the classroom or in tests. 

An ESL teacher offered the following comment, as well as
pertinent interpretive insight:



ESL Teacher. 

My students must pass a standardized English reading test at the 40th 
percentile; they often reach 90 percent on the SLEP but still are only about 20 
percent-35 percent on English test. This may not be a limitation of the SLEP, 
but may deal with expectations of N.Y. State and this standardized English



A few others commented more generally on the foregoing, 
typically observed pattern in the field of ESL proficiency 
assessment, as follows: 



The SLEP doesn't test 'academic reading' ability; high-scorers may not be
able to perform well academically. 



The comments on perceived limitations of the SLEP focus
attention on the complexity of the assessment problems that 
confront ESL practitioners in SLEP-use contexts. 

Other indicated limitations and/or suggested changes in the 
SLEP and/or the SLEP Test Manual call attention to specific 
modifications that are worthy of consideration on their merit, 
without regard to frequency of mention. For example: 

• Include normal curve equivalent (NCE) conversions of
percentiles in the SLEP Test Manual.

• Provide a separately scored vocabulary section. 

• Offer up-to-date norms; norms for specific subgroups.

• Additional forms would be useful. 

• Provide more assessment of grammar/usage.

• Provide a taxonomy of item types according to the 
specific linguistic skills they are designed to assess (to 
enhance the usefulness of SLEP for diagnosis or for a more 
specific, curriculum-linked assessment of change).
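
With regard to the first suggestion in the list above, a normal curve equivalent is conventionally obtained from a percentile rank as NCE = 50 + 21.06z, where z is the standard normal deviate corresponding to the percentile. The following minimal sketch illustrates the conversion; the function name is hypothetical, and nothing here is taken from the SLEP Test Manual.

```python
from statistics import NormalDist


def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (between 0 and 100) to an NCE."""
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must lie strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile_rank / 100.0)
    return 50.0 + 21.06 * z


# Percentile ranks cited for the Q5d score ranges in Exhibit A.
for rank in (24, 39, 57, 75):
    print(rank, round(percentile_to_nce(rank), 1))
```

By construction, NCEs of 1, 50, and 99 coincide with percentile ranks of 1, 50, and 99; elsewhere the two scales differ.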

As noted earlier, the listening comprehension "map" items 
were negatively mentioned by seven respondents. Said one
respondent:

In order to do well on these items, not only did one have to have good 
auditory memory, but also good spatial memory. I had difficulty with it, as I
have poor spatial orientation. 

Other comments about the "map" items were less generically 
critical. For example: 

(These items are) too difficult; . . . almost impossible for most students; . . .
extremely difficult because of the inference that has to be done--i.e., where to buy
a magazine. Or, map confusing, can't tell front of car from back easily.

Of course, it does not necessarily follow from these 
comments that the "map" items are less valid than other item
types in the SLEP. These comments indicate only that 
attention to both format- and validity-related questions 
regarding these items appears to be warranted. 

The comments and suggestions by SLEP users, summarized 
above, point out potentially important general directions for 
further development and/or modification of the SLEP Test
and/or the SLEP Test Manual.



The "Ideal Test Package"

"You have commented on aspects of the Secondary Level 
English Proficiency test, and related matters. More
generally, please describe briefly the characteristics
of a standardized test of ESL/EFL proficiency (and 
related developer-provided materials/services) that 
would be most helpful/useful in EFL/ESL assessment
contexts similar to yours." 

Nineteen respondents provided comments and/or suggestions 
regarding an idealized ESL "assessment package." As might be 
expected from many of the comments on "limitations" reviewed 
above, a recurring theme was that the "ideal" test battery 
would provide for assessment of all four basic skills and 
offer enhanced diagnostic potential.

The potential usefulness of a "lower level" test was noted
by two respondents. Some respondents wanted a somewhat shorter
test, with features designed to facilitate its administration.
Others wanted a test that was free of "cultural bias"
and "gender bias."

The detailed comments providing the basis for the foregoing 
summarization are included below. In several, less detailed 
comments, some practitioners petitioned for a breakdown (of 
information regarding test items) like that done by the 
publishers of CTBS tests (a U.S. achievement test battery), 
updated data on the relationship between SLEP and the TOEFL, 
or a measure of ability to "use academic language." 

One teacher called attention to the complexity of 
assessment involving "third-world" students with limited 
academic preparation, while another (from an independent 
secondary school in the U.S.) indicated simply that the "SLEP 
seems to provide most of what is needed at this particular 
school."

More comprehensive assessment

Testing Director, Japanese Branch U.S. University. 

For the purposes of placement, a more comprehensive standardized test 
would be welcome, i.e., one which includes balanced components measuring 
writing ability and speech production in addition to listening and reading 
comprehension. Given the homogeneous nature of our particular EFL context, 
however, I believe that a test developed for this particular population might be
more useful. It would be difficult for me to describe a standardized test that
would be more appropriate. 

ESL Department Head (Canada).

In addition to the listening and reading skills, some organized way of 
measuring speech and writing (would be helpful). It must be something that can
be administered without a heavy commitment of instructor time. 

English Department Chairperson, Academy (U.S.). 

A thorough assessment of a student's proficiency in English usage, grammar, 
speaking, reading, etc. Content and context geared to high school students' 
interests, experiences. Test which is easily scored.

Director of Testing, Preparatory School (U.S.).

I want to see vocabulary (in context) strength (or weakness), grammar
knowledge, (prepositions, verb usage). Idioms are not important at our level--the 
students pick these up in class. Oral expression might be assessed by audio-tape. 
For admissions purposes, we cannot handle students who cannot make 
themselves understood at a primary level. 



ESL Teacher, Migrant Education (U.S.). 

I would like to see a standardized test that would tell me the areas of
weakness. Something that would be helpful to teachers, so they could zero in on 
the areas of weakness and provide practice and language instruction that would 
improve these areas. 

ESL Dept, International School (Japan).

I would like to see a production component and some consistent assessment
of production added or available as a supplement. 



Reduce testing time 

Head, ESL Program Canada (College-related School).

A 45 to 60 minute listening/reading comprehension test which requires no 
introduction by the teacher and which has introductory instructions in many 
different languages so that the student can begin with confidence. 

ESL Coordinator (U.S. High School). 

A test that included the testing of the 4 skills--listening, speaking, reading,
writing. A test that takes less time to administer and grade. It is difficult in my 
program format to administer SLEP as a determining factor for ESL program 
entry (emphasis added). 



A "lower-level" test

EFL/ESL Specialist, School System (U.S.). 

It would be most helpful if a reading comprehension test at a lower level was
available. Many of our refugee students have little or no education. I would like 
to be able to assess their skills better. However, this is a good test, and I intend 
to continue to use it.

English Director (U.S. Public School).

A second language test for students in lower grades would be helpful. 



Freedom from cultural and gender bias

Executive, American School (Europe).

Ours is a unique program--not placement alone--but personal qualities must
be assessed. SLEP serves its purposes but would never serve alone. Only 
addition would be writing sample. I assume research is done in regard to gender 
and cultural background being unbiased. This would be very important--that
SLEP test is not biased to sex and to American cultural background (but 
probably unavoidable). 

Executive, International College (Japan).

Tests without a lot of culturally biased vocabulary and subject matter. A 
weighted test which can easily be used to level or sequence students and 
curriculum needs. 



One respondent seized upon the opportunity for humor lurking
in an invitation that seemed to suggest the possibility of
devising a test that would meet the extremely complex
assessment demands confronting ESL practitioners, and
characterized the ideal test as follows:



Supervisor, ESL/Bilingual Program (U.S.). 

The ideal test would serve well for both student assessment and program
evaluation. It would be a criterion-referenced test (magically based on our own
curriculum) that can also be interpreted by norms. Information would be
sufficiently rich to yield placement, diagnostic, and normative interpretations. In 
addition, such a test would provide data on growth, gains, and gap reduction 
that would satisfy federal reporting requirements and our own omnivorous 
curiosity.

Summary



The findings of this survey represent the results of an ad 
hoc, formal effort to obtain feedback from practitioners in 
diverse SLEP-use contexts. Although small, the sample of re- 
spondents appears to be generally representative of the sample 
surveyed with respect to type of institution and location 
(U.S.A., Canada, other country). General trends in findings
are summarized below. 

• Almost 90 percent of the respondents reported using 
SLEP Form 1, 70 percent reported using Form 2, and 37 
percent reported using Form 3.

• About one-third of the respondents indicated that the 
SLEP was the sole ESL proficiency measure being used. 

• The number of examinees tested annually varies considerably
across use contexts; about 34 percent reported testing fewer
than 50 examinees, and 19 percent reported testing 250 or
more examinees.

• In the majority of cases (60 percent) examinees are 
tested at least two times. 

• Slightly more than one-half (52 percent) of the sample 
reported that testing was restricted to students in the 
G7-12 range; more than one-third reported testing 
college-level students; some 20 percent reported testing 
sixth graders. Respondents' comments indicate that the 
SLEP is perceived to be generally suitable for use with 
examinees at quite diverse educational levels ranging 
upward from sixth grade through college — but may tend to 
be relatively more suitable for college-level than for 
sixth-grade level examinees. 

• Respondents from use contexts in the U.S.A. and Canada 
reported local populations comprising not only 
"international students" (typically accounting for about 
43 percent of examinees), but also resident aliens,
recent immigrants, refugees, undocumented individuals, 
and so on. In other countries, testing populations 
comprised primarily local residents and other nonnative- 
English speakers studying or planning to study in 
English-medium preparatory schools or colleges, situated 
locally or elsewhere. 

• Local examinee populations differ rather markedly in 
heterogeneity of language background. In about one- 
third of the settings, only one language group is 
represented; in some 29 percent of the settings, eight 
or more language groups are represented.

• The SLEP is being used, typically, for at least two of
the purposes that are recommended in the Manual--that
is, to assess readiness for undertaking full-time
English-medium academic programs (in about 70 percent of
use contexts), to assess average gain (estimated 50
percent), for ESL placement (about 45 percent), or in
screening for admission to institutions or programs
(about 33 percent). In addition, some 57 percent of
U.S.A./Canada respondents and about 31 percent of all
others indicated use of the SLEP for monitoring the
progress of individual students--a practice not
specifically mentioned in the Manual. It was not
described in detail by any of the respondents reporting
it.

• Slightly more than one-half of the sample (58 per- 
cent) indicated they had conducted local studies of the 
relationship between SLEP scores and other measures 
(e.g., teacher's ratings of oral English proficiency). 
Only 14 percent reported they had developed "local norms 
for the SLEP (e.g., a table showing the percentage of 
students scoring at or below designated SLEP scores)." 
By inference from the nature of the comments provided by 
respondents, some local assessments of SLEP's validity
and usefulness are quite sophisticated, but most of them
are relatively informal--frequently involving primarily
clinical perception rather than statistical documentation.

• At the same time, there was a relatively consistent 
"positive validity" theme in the comments. Respondents 
relatively consistently reported having observed 
positive relationships between SLEP scores and more 
direct measures, such as those referred to in the 
question (see Exhibit A, above). They often indicated
generally that the SLEP had been found to be "valid" or
"useful" for local purposes.

• Respondents named as "positive features of the SLEP 
Test," its administrative convenience, the fact that it 
is a standardized test of both listening and reading 
skills, and its validity/usefulness for local purposes. 

• Comments on "negative features" of the SLEP typically 
did not single out for criticism any specific features 
of the SLEP. Rather, the recurring theme reflected a 
need for a more comprehensive measure. More 
specifically, the fact that the "SLEP does not test 
production" was mentioned with relative frequency as a 
limitation of the SLEP. Several respondents also 
indicated, as a negative feature of the SLEP, that high 
performance on the SLEP does not necessarily indicate a 
comparably high level of functional ability to deal with
"academic English" in classroom settings — clearly a 
generic problem. 

• Nineteen respondents accepted the invitation to de- 
scribe a "test package" that would be most helpful/ 
useful to them. Consistent with the general comments on 
SLEP's "limitations," a recurring theme was that the 
ideal test battery would provide for assessment of all 
four basic skills and offer enhanced diagnostic poten- 
tial. Less prevalent themes called for reducing testing 
time by developing a shorter test, a test for lower 
grade levels, and a test that is free from gender or
cultural bias. 

Some of the implications of these findings for research and 
development (R&D) activities involving the SLEP are discussed 
in the following section. 



Implications of the Survey Findings 

The information, ideas, comments, and suggestions of survey 
respondents are useful and important, on merit, without regard 
to statistical considerations generally or to the fact that 
only a small percentage of the total population of SLEP-users 
responded to the survey questionnaire. The responding sample, 
as indicated, appears to be representative of the general 
test-using population. The fact that the SLEP is being used 
relatively extensively with postsecondary-level students is 
noteworthy; more than one-third of the respondents reported 
that the SLEP was being used with college-level students. 

Based on respondents' descriptions of findings of local 
studies and/or their clinical observations, scores on the test 
have been found to be positively related to other indices of 
ESL proficiency, including direct assessments of oral English 
proficiency and writing skills, across samples from diverse 
test-use contexts. 

This feedback and other pertinent evidence suggest as
a strong working hypothesis that the SLEP can be expected to 
provide reliable and valid measurement of ESL listening com- 
prehension and reading skills in samples of college-level 
students as well as in samples of younger students. Thus 
SLEP's identification as a test designed for use with "second- 
ary level" students appears to be unduly restrictive in its 
connotations.

Additional research is needed, however, to establish SLEP's
validity in an expanded population, to extend evidence of 
validity generally, and to develop up-to-date and compre- 
hensive reference-group data for samples from representative, 
current, and potential SLEP-use contexts. 

Generally speaking, formal research-based evidence bearing 
on SLEP's reliability and validity is relatively limited — as 
compared to the large body of evidence bearing on TOEFL's
validity, for example. The only reference group available 
reflects the performance of ESL students in grades 7 through 
12 in approximately 50 U.S. public schools, tested circa 1980 
(with SLEP Form 1). The need for updated and expanded
reference group data for the SLEP was specifically noted by a 
number of respondents and is recognized by the SLEP School 
Services Program. 

The development of current, comprehensive reference-group 
data for both secondary-level and postsecondary-level samples,
classified by gender, language background, and other pertinent 
variables, is needed to enhance the usefulness of the SLEP (as 
well as to establish or maintain its "certifiability" for use
in certain contexts). 

Local SLEP users, in postsecondary-level and high-school 
level settings, are in a position to contribute directly to 
the development of reference-group data and additional formal 
evidence bearing on SLEP's validity by participating in 
cooperative studies designed to collect SLEP scores, back- 
ground data, and pertinent criterion data from representative 
testing contexts. 

Selected SLEP users might be invited to provide SLEP data 
and ratings or scores on a "common criterion measure" (e.g., 
grades in ESL courses, ESL teacher's ratings of proficiency 
according to a standard scale, and so on) for defined samples. 
Given such data, it would be possible to conduct centrally the 
types of analyses needed to assess the strength and consistency
of association between SLEP scores and the criterion
measure(s) involved.
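
The central analysis described above might be sketched as follows. This is a minimal illustration only, not a procedure used by the SLEP program: it assumes each participating site supplies records of the form (site, SLEP score, criterion rating), and the function names and record layout are hypothetical. It computes a Pearson correlation separately for each site so that the consistency of the SLEP/criterion relationship across use contexts can be inspected.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary within each list")
    return sxy / sqrt(sxx * syy)


def correlations_by_site(records):
    """Group (site, slep_score, criterion_rating) records by site (hypothetical
    layout) and return a dict mapping each site to its SLEP/criterion correlation."""
    by_site = {}
    for site, slep_score, criterion in records:
        slep_list, crit_list = by_site.setdefault(site, ([], []))
        slep_list.append(slep_score)
        crit_list.append(criterion)
    return {site: pearson_r(xs, ys) for site, (xs, ys) in by_site.items()}
```

Reporting one coefficient per site, rather than a single pooled coefficient, keeps differences among use contexts visible; pooling would be a separate design decision.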

Because the TOEFL is widely used and has been extensively 
validated for postsecondary-level samples, it would be useful 
to conduct studies designed to extend evidence regarding the 
strength and consistency of SLEP/TOEFL relationships in post- 
secondary-level samples. Similar studies should be conducted 
in samples of secondary level students in settings where 
attaining levels of proficiency indexed by TOEFL scores 
represents an important goal for the students involved.

Other Avenues for SLEP-Related R&D Activities



As a measure specifically designed to gauge listening 
comprehension and reading skills, the SLEP obviously cannot 
meet the complex range of assessment needs and concerns 
expressed by the respondents to this survey — including the 
need for comprehensive assessment of productive as well as 
receptive skills. 

Promote widespread use of standard procedures for rating 
productive skills 

It is important to promote the local use of standard 
procedures for assessing writing and speaking skills. For 
example, consideration might be given to the development of 
brief, behaviorally anchored rating schedules that ESL
teachers could use in rating essays or speaking ability--
perhaps adaptations of currently available scales for
evaluating these skills. In any event, it seems important
to encourage SLEP users to adopt standard procedures for
rating basic skills--procedures whose usefulness could be
explored in cooperative studies in which the ratings 
constitute "common criteria" across use contexts. 

Explore SLEP's validity below the G7-12 range 

Survey findings indicate that the SLEP is being used for sixth 
graders in a number of contexts. Evidence bearing on SLEP's 
"suitability for use with 6th graders," is quite limited. 
Some respondents suggested the potential usefulness of a 
"lower level" of the SLEP for use below the G7-12 range. 
Research is needed to assess SLEP's difficulty, reliability, 
and validity in samples below the 7th grade level. 

Available evidence (e.g., ETS, 1991; Holloway, 1984) sug- 
gests that most native English-speaking seventh graders have 
"mastered" the skills measured by the SLEP — that is, they tend 
to "top out" on the SLEP, more so on listening comprehension 
than on reading. At what age/grade level do SLEP items begin 
to represent relatively difficult cognitive tasks for native 
English-speaking students? A study designed to answer this 
question would provide information that is pertinent to the 
problem of establishing the lower "age/grade limit" of SLEP's 
applicability. 

Increase SLEP's "assessment efficiency" 

The amount of time required to administer the SLEP — the amount 
of time needed for placement testing generally — was a matter 
of some importance for a number of the ESL practitioners who
responded to the survey, as it was to those interviewed a 
decade earlier by Hale and Hinofotis (1981: pp. 10-11). It 
seems important to consider research and development
activities designed to explore options that might result in 
"increased efficiency of measurement," for assessments 
involving the SLEP test — for example, by reducing testing 
time, and by introducing features that would capitalize on the 
general (class-level) diagnostic potential inherent in SLEP 
items.

Explore the reliability and validity of a shorter test 

Time needed for ESL proficiency assessment is a matter of
considerable importance in SLEP-use contexts. The SLEP, as
presently constituted, requires approximately one and one-half
hours of testing time. In one major testing context involving
member institutions of the Los Angeles Community College
District (LACCD), the colleges use an "abbreviated" version
of the SLEP for the express purpose of reducing the total
amount of time needed for placement testing.

The LACCD reduces SLEP-testing time simply by not 
administering two of the sections (Tillberg, 1991, personal 
communication). This approach to reducing testing time and
the particular item types selected for inclusion/exclusion 
were, as recommended by Butler (1989), based on analyses that 
included an assessment of the comparative validity of scores 
on the full and shortened versions for discriminating among 
independently defined ESL proficiency-placement groups. 

It would be useful to conduct research designed to assess 
the effects of the approach described above and other ap- 
proaches to reducing the length (and time required for admin- 
istration) of the SLEP test, on reliability, concurrent valid- 
ity, validity for placement and other specific purposes, and 
so on. Exploratory research might be conducted, retrospec- 
tively, using existing data sets that include item-level 
scores for the complete SLEP test and criterion scores 
(teacher's ratings, and so on). 
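
One way such a retrospective analysis might begin is sketched below. The sketch is illustrative only and rests on assumptions not drawn from the survey: item-level scores are taken to be available as one row of 0/1 values per examinee, and internal consistency is estimated with coefficient alpha (which equals KR-20 for dichotomous items when population variances are used). The function names are hypothetical. The same estimate is computed for the full item set and for any retained subset, so the reliability cost of a shortened form can be compared directly.

```python
def _population_variance(values):
    """Population variance (divides by n) of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def cronbach_alpha(item_scores):
    """Coefficient alpha for a matrix of item scores.

    item_scores: list of examinee rows, each a list of item scores
    (e.g., 0 = wrong, 1 = right), all rows the same length.
    """
    if len(item_scores) < 2:
        raise ValueError("need at least 2 examinees")
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("need at least 2 items")
    item_variances = [
        _population_variance([row[i] for row in item_scores])
        for i in range(n_items)
    ]
    total_variance = _population_variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def alpha_for_subset(item_scores, keep):
    """Alpha for a shortened form made of the item positions listed in `keep`."""
    return cronbach_alpha([[row[i] for i in keep] for row in item_scores])
```

Comparing `cronbach_alpha(scores)` with `alpha_for_subset(scores, keep)` addresses reliability only; validity for placement would still require criterion data such as the teachers' ratings mentioned above.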

Assess contribution of item types to validity 

Little attention has been given to assessing the comparative 
validity of the respective SLEP item types for predicting 
basic performance criteria (e.g., ratings of oral language 
proficiency or writing ability). Studies of the relative
validity of SLEP item types would contribute information that 
is relevant to the problem of developing a shorter test. The 
studies, incidentally, would also contribute to an empirical 
evaluation of the validity-related properties of the map items 
that were mentioned negatively by several respondents. In 
this same general area, it would be useful to analyze the 
factor structure of the SLEP using, for example, data sets 
supplied by SLEP users.

Enhance SLEP's general diagnostic potential



The SLEP was developed to provide a reliable and valid basis 
for assessing ESL listening comprehension and reading compre- 
hension. Attention naturally is focused almost exclusively on 
the reliability, validity, and usefulness of these two scores
and the total score. Little attention is paid to the variety 
of subskills that may be tapped by different sets of test 
items. These are of potential interest to ESL teachers and 
others interested in identifying general instructional areas 
that may need more or less attention in plans for instruction. 

According to one ESL/Bilingual supervisor, for example: 

Although the SLEP is not a criterion referenced test, it would be helpful to 
know what underlying skills or curriculum goals, if any, are addressed by the test
items. 

Developing a taxonomy of skills/functions tapped by 
existing SLEP items would contribute directly to increased 
efficiency of assessment by enhancing the general diagnostic 
potential of the test. Even though the SLEP is not designed
with particular curricular goals or discrete skill development
in mind, the types of skills/functions represented in the test
items are likely to be common foci of instruction in most ESL 
curricula. Averages based on subsets of items by skill areas 
appear to have potential value for general evaluation and 
instructional purposes. 
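
The idea of class-level averages by skill area can be illustrated with a short sketch. No such taxonomy of SLEP items exists in the material reviewed here, so the item identifiers and skill labels below are purely hypothetical, as is the function name. Given a mapping from items to skill areas and each examinee's right/wrong item scores, the sketch returns the proportion correct for the class in each skill area.

```python
from collections import defaultdict


def skill_area_averages(responses, item_skills):
    """Class-level proportion correct per skill area.

    responses:   list of dicts, one per examinee, mapping item id -> 0 or 1.
    item_skills: dict mapping item id -> skill-area label (hypothetical taxonomy).
    Items without a skill label are ignored.
    """
    correct = defaultdict(int)
    attempted = defaultdict(int)
    for examinee in responses:
        for item_id, score in examinee.items():
            skill = item_skills.get(item_id)
            if skill is None:
                continue
            correct[skill] += score
            attempted[skill] += 1
    return {skill: correct[skill] / attempted[skill] for skill in attempted}


# Hypothetical example: three items tagged with two illustrative skill areas.
example_skills = {"L1": "prepositions", "L2": "prepositions", "R1": "vocabulary"}
example_responses = [
    {"L1": 1, "L2": 0, "R1": 1},
    {"L1": 1, "L2": 1, "R1": 0},
]
example_averages = skill_area_averages(example_responses, example_skills)
```

Because each skill area would typically be represented by only a few items, averages of this kind seem better suited to describing a class than to diagnosing an individual student.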



Cultivate "Cooperative Interaction" with SLEP Users 

It would be useful to consider procedures designed to 
promote closer ties and more frequent professional and col- 
legial interaction between the SLEP School Services Program 
and the ESL practitioners who administer and use the SLEP in 
diverse local settings, worldwide. 

An important, albeit simple, step in that direction would 
be to modify SLEP ordering procedures by asking for full 
professional and personal identification of the "individual 
who will be responsible for using SLEP." This information is 
essential to the definition of a "population of SLEP users," 
as well as "SLEP ordering institutions." Lack of personal 
identification for SLEP users complicates efforts to interact 
with those who are actually using the test, as indicated by 
the difficulties the present survey encountered in identifying 
and contacting "SLEP users," outlined at the outset.

Other steps that might be considered include establishment
of a "SLEP Advisory Service," including a toll-free "hotline"
through which practitioners can raise and receive
answers to questions about SLEP use and interpretation. A 
periodic "newsletter" would provide a means of keeping test 
users informed of developments regarding SLEP. If SLEP users 
were encouraged to provide reports of local studies, these 
results could be shared periodically with all test users 
through the newsletter, and so on. 

Steps taken to encourage and facilitate professional 
interaction between the SLEP School Services Program and SLEP 
users should be beneficial to all involved. Consideration 
might be given to the development of a model for implementing 
a program of cooperative interaction between the School 
Services Program and SLEP users, that would involve periodic 
data feedback from test users in exchange for central analysis 
and reporting by the program.

Appendix A: Illustrative SLEP Items



Sample Questions 



Section 1 

The first section of the SLEP test measures ability to understand spoken
English and is 35-40 minutes long. It is divided into four parts, with four different
types of questions.

Part A 

For the first type of question, the student must match one of four recorded
sentences with a picture in the test book. The sentences are spoken only once and
are not printed in the test book. This part contains items dealing with correct
recognition of minimal pair contrasts, juncture, stress, sound clusters, tense, voice,
prepositions, and vocabulary.

Sample Questions

Note: Pictures are for illustrative purposes only.
Actual pictures and drawings in the test booklet
are two to four times larger than sample
pictures in this brochure.

1. On tape:

Look at the picture marked 1.

On tape:

(A) There is an [illegible] in the sky.

(B) The building has a tall tower.

(C) The judge is bowing his head.

(D) There is a [illegible] in front of the building.




2. On tape:

Look at the picture marked 2.

On tape:

(A) The bird is standing on top of the pole.

(B) The bird is flying over the fence.

(C) The bird is digging in the sand.

(D) The bird is eating the grass.

3. On tape:

Look at the picture marked 3.

On tape:

(A) There's a statue of a lion.

(B) The line is very straight.

(C) The wine is near the window.

(D) There's a lane near the building.





4. On tape:

Look at the picture marked 4.

On tape:

(A) The brain is protected by bone.

(B) The train is on the track.

(C) The drain is stopped.

(D) The rain is coming down.

Part B

These questions approximate the type of dictation exercises used frequently
in English language classes: the student must match a sentence printed in the test
book with a sentence heard on the tape. The questions focus on the relationship
between structure and meaning.

Sample Questions 

1. On tape: The class can finish it in less than an hour.

In test book: (A) The classes can't finish in half an hour.

(B) The class won't be finished for an hour.

(C) The classes will take at least an hour.

(D) The class can finish it in less than an hour.

2. On tape: Why aren't they fixing the car?

In test book: (A) Are they fixing the car?

(B) I'm fixing the car.

(C) Why aren't they fixing the car?

(D) The car has been fixed.

3. On tape: While I was waiting for my sister, she got the news.

In test book: (A) While I was waiting for my sister, she got the news.

(B) While my sister was waiting for me, she got the news.

(C) I was waiting for my sister to get the news.

(D) I was waiting for my sister when I got the news.






4. On tape: He didn't know how to get to the gym.

In test book: (A) He didn't go to the gym.

(B) He explained how to use the gym.

(C) He told us to get to the gym.

(D) He didn't know how to get to the gym.

5. On tape: Bill has one brother and one sister, and so does Jane.

In test book: (A) Bill has one brother and one sister, and so does Jane.

(B) Bill has one brother and a sister named Jane.

(C) Bill and Jane are brother and sister.

(D) Bill's brother and sister like to be with Jane.



Part C

For the second type of question, the student refers to a map in the test book
(see page 11). Streets and buildings on the map are labeled, and there are four cars,
marked A, B, C and D. The student must choose the one car that is the source of
a brief conversation on the recording. The questions in this part assess a variety of
linguistic, cultural, and pragmatic concepts. These include directions, recognition
of building names and associated vocabulary, distance, and time.

Sample Questions

1. On tape:

(man) The museum has a special exhibit this week. [illegible]
(woman) I'd like to see it very much. If we continue on Mackerel to the circle and go
around to Salmon, we can park on Cod Lane.
(third voice) Which car are the people in?

2. On tape:

(man) I would like to find the way to the circle. From there, I know how to
get home.

(woman) It's not too hard. If we bear right into Bass and then go south on
Salmon, we will end up at the circle.
(third voice) Which car are the people in?

3. On tape:

(woman) The judges are going to hear a very interesting case today. Let's stop
at the courts.

(man) That's a good idea. I'll go north at the next intersection and cross Pike
Avenue. We can park in the lot across the street from the courts.
(third voice) Which car are the people in?

Part D

The questions in this part are based on conversations, recorded by American
high school students, that represent typical secondary school situations. The
conversations take place in various parts of a school and deal with events that
typically occur in each location. Conversations also deal with extracurricular
activities, academic subjects, school closings, and holidays. For each recorded
question, the student must choose one of four answers printed in the test book.

Sample Questions

1. On tape:

(Bob) I heard that it is supposed to be a very good band. Since the game
starts at 7:30, Nancy, I'll pick you up at 7.
(Nancy) That's fine. I'll be ready. It takes 15 minutes to get to the gym, so
we'll have time.
(third voice) At what time will they arrive at the gym?

In test book: (A) 6:45.

(B) 7:00.

(C) 7:15.

(D) 7:30.

For questions 2 and 3. 

2. On tape:

(Nancy) Jane, what are you going to wear to the game?
(Jane) I'm not sure yet. I don't want to have a heavy sweater on at the
dance. It'll be pretty warm in the gym. I'll probably wear a light
dress, even though the weather outside might not be so warm.
(third voice) What is the girl going to wear?

In test book: (A) A heavy sweater.

(B) A heavy coat.

(C) Some light slacks.

(D) A light dress.

3. (On tape) What is the girl's reason for this decision?

In test book: (A) She expects it to be cold outside. 

(B) She expects it to be warm inside. 

(C) It is going to snow. 

(D) It will be very windy.

Section 2

The second section of the test is 40 minutes long and measures ability to
understand written English. The questions cover grammar, vocabulary, and
reading comprehension. There are three parts to Section 2.

Part A

For each question in this part, the student must match the reaction of one of
four characters in a cartoon with a printed sentence.




Sample Questions

1. All those wet clothes. The children will want to stay outside and I'll spend my
time trying to keep them dry.

2. I can hardly wait to make the first snowball. I've been waiting all year to get
back at her.

3. Oh, my aching back. The car will be covered and I'll have to shovel it out.

4. Isn't it great that school might be closed? I'd much rather have fun outside than
stay in school. What better way to spend a snowy day.

5. I'm going to be awfully hungry, I shouldn't have hidden that bone. It would 
have been better to leave it in the house. 

Part B

For the questions in this part, the student must match a printed sentence
with one of four drawings. The particular focus of this item type is the use of
prepositions, pronouns, adverbs, and numbers.

Sample Questions

1. One girl is eating ice cream but two aren't.







2. The small square is in the upper left corner.






3. He is bending over to pick up the box.






4. The car almost hit him while he was crossing the street. 







Part C

This part of Section 2 contains questions of two types. In one, the student
must complete passages by selecting the appropriate words or phrases from among
four choices printed at intervals in the passages.

Sample Passage and Questions 

Sound is something we 

1. (A) hears. 
   (B) hearing. 
   (C) heard. 
   (D) hear. 

It comes to your 

2. (A) eyes 
   (B) nose 
   (C) ears 
   (D) mouth 

in different ways. It might be pleasant, like the voice of a friend, 

3. (A) when 
   (B) as 
   (C) or 
   (D) since 

unpleasant, like the screech of a train's wheels on a railroad 

4. (A) station. 
   (B) track. 
   (C) light. 
   (D) conductor. 

Some sounds are loud, and some are soft, some are high, and some are 

5. (A) [illegible] 
   (B) low. 
   (C) quiet. 
   (D) big. 

Sound is very 

6. (A) importance 
   (B) importantly 
   (C) important 
   (D) import 

to us because it is the basic means of communication. 



In the second type of question, the student must answer questions about the 
passage for which he or she supplied the missing words or phrases. 

Sample Questions 

7. What does screech in line 3 mean? 

(A) noise (B) motion (C) place (D) piece 
8. Which of the phrases below is another example of a pleasant sound, similar to 
the phrase in the sentence that begins in line 2, "like the voice of a friend"? 

(A) Like the ring of an alarm (B) Like the wail of a siren 
(C) Like the honk of a horn (D) Like the song of a bird 

9. Which sentence below has almost the same meaning as the sentence that 
begins in line 5? 

(A) It is meaningful to communicate with sound. 

(B) The main way we communicate is with sound. 

(C) The meaning of sound is basic to communication. 

(D) In order to communicate, we need basic sounds. 

Part D 

In this part of Section 2, the student must read a short literary passage and 
answer questions about it. 

Sample Passage and Questions 

The footsteps began about a quarter past one o'clock in the morning, 
a rhythmic quick-cadenced walking around the dining room table. My 
mother was asleep in one room upstairs, my brother Herman in another, 
grandfather was in the attic, in the old walnut bed. I had just stepped out of 
the bathtub and was busily rubbing myself with a towel when I heard the 
steps. They were the steps of a man walking rapidly around the dining room 
table downstairs. 

1. What did the writer hear? 

(A) A soldier marching (B) His brother snoring 
(C) His mother talking (D) A person walking 

2. Where did the sounds come from? 

(A) The attic (B) The dining room 
(C) The bathroom (D) The stairs 

3. What was most of the family doing? 

(A) Listening (B) Working (C) Bathing (D) Sleeping 
4. What was the writer doing? 

(A) Talking to himself (B) Drying himself 
(C) Brushing his hair (D) Getting dressed 

5. The bed in the attic was made of which of the following materials? 
(A) Metal (B) Wood (C) Feathers (D) Straw 

6. What time did the sounds begin? 

(A) 12:45 p.m. (B) 1:00 a.m. (C) 1:15 a.m. (D) 1:30 p.m. 
Exhibit B.2: Questionnaire for U.S. and Canadian test 
users 

Exhibit B.3: Questionnaire for test users in all other 
countries 

Exhibit B.1: Cover Letter, General Question 



EDUCATIONAL TESTING SERVICE 

PRINCETON, N.J. 086- 

April 17, 1991 
Dear Colleague: 



Since 1980, the Secondary Level English Proficiency (SLEP) test, developed by Educational 
Testing Service (ETS) for assessing the English-language listening comprehension and reading 
skills of nonnative-English speaking (ESL/EFL) students in the G7-12 age/grade range, has been 
administered and scored locally in scattered settings throughout the world. The SLEP program 
needs, but does not regularly receive, feedback from test users regarding the variety of purposes 
for which the test is being used, the age/grade levels and language backgrounds of the students 
being tested, perceived strengths and limitations of the test for particular purposes, and so on. 
Without such feedback, the program is not in a position to judge the extent to which current 
forms of the SLEP are meeting the needs of users and introduce modifications designed to 
improve the overall usefulness of the test from the perspective of practitioners in diverse use 
settings. 

By inference from information supplied by the SLEP testing program regarding orders for SLEP 
booklets and related materials in recent months, it appears that the SLEP is being used or 
considered for possible use in one or more programs in your setting. The brief questionnaire 
enclosed is designed to obtain feedback regarding the types of issues indicated above. Survey 
findings will be summarized statistically, and survey respondents will receive a brief summary 
report in which neither individual respondents nor their institutions will be identified directly with 
particular findings. Respondent identification, called for on the cover of the questionnaire, is 
needed to facilitate followup inquiries that may be needed to clarify particular questionnaire 
responses and to identify the individuals most directly concerned with use of the SLEP 
examination (to whom copies of the survey summary will be sent). A prepaid business-reply 
envelope is enclosed for returning the completed questionnaire. 

Your assistance in completing and returning the questionnaire, or in forwarding this letter and the 
enclosures to the individual who is most directly involved with SLEP use in your setting, will be 
greatly appreciated. 



Sincerely, 

Kenneth M. Wilson 
Research Psychologist 

Copy for: Ms. Stella Cowell 

Director, SLEP Program 

Encl: Questionnaire and return envelope 



FROM A PRACTITIONER'S PERSPECTIVE 

You have commented on aspects of the Secondary Level English Proficiency test, and related 
matters. More generally, please describe briefly the characteristics of a standardized test of ESL/EFL 
proficiency (and related developer-provided materials/services) that would be most helpful/useful in 
EFL/ESL assessment contexts similar to yours. 
Exhibit B.2: Questionnaire for U.S.A./Canada (p. 1 of 2) 

Exhibit B.2: Questionnaire for U.S.A./Canada (p. 2 of 2) 

Exhibit B.3: Questionnaire for Other Countries (p. 2 of 2) 






Appendix C: Procedures Followed in Modifying Existing 
Addresses for the current Survey 

As indicated in the text, many of the addresses did not 
specify a pertinent "use-related" title or program. For 
example, many orders were placed by and shipped to school 
boards, school districts, institutional fiscal offices or 
agents, and so on. In order to provide a more specific target 
for the survey questionnaire, these general addresses were 
modified to include a plausible ESL-related recipient, as 
outlined below: 

1. For U.S. and Canadian addresses involving district-level 
or board-level orders (e.g., Board of Education, School 
District No. 10, and so on), or orders placed through, or to 
be shipped to, a financial office (e.g., bursar, accounts 
payable), with no individual, departmental, or ESL program 
identification, a program (e.g., "ESL/Bilingual Program" for 
a district) or position/program (e.g., Director, ESL/Bilingual 
Program) was specified in the survey mailing. 

2. In the case of individual schools, community colleges, 
four-year colleges and universities, or other institutions/ 
agencies for which no specific position/title/ESL program 
designation was available, a title/program designation (e.g., 
Director, ESL Proficiency Program) was added. 

3. The "English Department" was targeted in the case of 
general addresses (other than Canadian) for orders from insti- 
tutions clearly identified as schools, academies, colleges, 
and so on, outside the U.S. proper. 

4. For orders placed through embassy, consular, or other 
governmental offices, a position title such as, "Adviser, ESL 
Proficiency Testing," or "Education Adviser," was specified. 

The covering letter included a request that the ques- 
tionnaire be forwarded to the appropriate individual or 
office. 
Appendix D: Illustrative Items from the Sequential Tests of 
Educational Progress (STEP): Listening 
Comprehension and Reading Comprehension** 

Brief descriptions of the listening comprehension and 
reading comprehension tests in the STEP series are provided in 
the following two pages. These tests are designed for use 
with native English-speaking students; the illustrative 
listening test material is for Grades 7-9; the reading test 
material is for Grades 4-6. 

It is instructive to compare these items with those in the 
SLEP test (Appendix A). The SLEP items clearly are less 
cognitively demanding than are the STEP items. Accordingly, 
ESL students with average scores on SLEP can be expected to 
earn scores on a test like STEP that are below average 
relative to native-speaker norms. 

See the SLEP Test Manual (e.g., ETS, 1987) for evidence 
indicating that native English-speaking 7th graders can answer 
correctly almost all the SLEP items; see also Holloway (1984). 

It would be useful to determine the age/grade level at 
which SLEP items begin to represent a significant cognitive 
challenge for native English-speaking students. 

** See ETS (1958). 
STEP Listening Comprehension (for native English speakers) 

The Listening Comprehension Tests 



The members of the committee on the listening comprehension 
tests are: 

Chairman: Althea Beery, Cincinnati Public Schools 

Level 1 (Grades 13-14) 
Seymour [illegible], [illegible] School, [illegible], New York 
*Ralph C. Leyden, Stephens College 
Osmond E. Palmer, Michigan State University 

Level 2 (Grades 10-12) 
*John Caffrey, Los Angeles County Schools 
Alice P. Sterner, Barringer High School, Newark, New Jersey 

Level 3 (Grades 7-9) 
William [illegible], University High School, University of Minnesota 
*Stanley B. Kegler, University High School, University of Minnesota 
Nathan A. Miller, Little River Junior High School, Miami, Florida 

Level 4 (Grades 4-6) 
*Ursula Hogan, Sacramento County Schools, Sacramento, California 
Mildred Patterson, Public Schools, Wilmington, Delaware 
Charlotte Wells, University of Missouri 

*Members of planning committee, Althea Beery, Chairman 



What is listening comprehension? 
The student listens to be informed, to be inspired, to be 
convinced, or to be entertained. Whatever his purpose, 
it is important for the student to listen with understanding. 
Three goals, or levels of understanding, were established 
for intelligent listening: 

1. Comprehension: What is the plain sense of what 
is heard? 

2. Interpretation: What was the speaker trying to 
do? What were the implied meanings of the message? 
How does what is heard relate to other common 
knowledge? 

3. Evaluation: What are the weaknesses and strengths 
of the speaker's presentation? How valid is the message 
in the light of common knowledge? 

A good listener is not a sponge, absorbing everything 
without discrimination. He listens critically and selectively. 
He remembers significant details, but not all details. 
More important, he remembers the speaker's main 
ideas and conclusions and appraises them critically. The 
development of such critical and selective listening is a 
goal with which schools are concerned. The STEP Listening 
Comprehension Tests are designed to measure the 
schools' success in achieving this goal. 

Criteria for selection of materials 

How may children develop essential listening skills in 
school? At what grade levels should particular skills be 
emphasized? What sequence of development is desirable 
if children are to grow in listening ability? 

In seeking answers to these questions as a basis for 
test development, these criteria were established: 

■ Listening situations should sample all types of listening 
familiar to students in their school experiences: directions 
and simple explanations, exposition, narration 
(both simple and figurative), argument and persuasion, 
aesthetic material. 

■ Language used should be real, that is, "language as it is 
heard," rather than language as it is read. 

■ Selections, and the questions based on them, must test a 
variety of skills and understandings, emphasizing selective 
memory and the ability to think about what is heard. 



Skills tested 

Basic listening skills were identified and organized around 
four aspects of what is communicated: main idea, significant 
details, organization of details, meaning of words. 
Since the skills are not isolated from one another, many 
test questions involve more than one skill. In most cases, 
however, it is possible to identify a basic skill required to 
answer the question. 

I Plain-sense comprehension 

1 To identify main ideas; to select a suitable title or to 
select a correct statement of the main idea or central 
theme. 

2 To remember the significant details. 

3 To remember the structure or simple sequence of 
ideas. 

4 To demonstrate understanding of denotative meanings 
of important words. 

II Interpretation (higher-level meanings) 

1 To understand the implications of the main ideas; to 
understand what the speaker is trying to do; to see 
how the main ideas may reveal the speaker's attitudes 
and prejudices; to recognize the relationship 
of the speaker's statements to other ideas or to common 
knowledge. 

2 To understand the implication of significant details; 
to understand how the details are pertinent to the 
speaker's purpose; to see how the details reveal the 
speaker's attitudes, biases, and prejudices; to see 
relationships among the details and their validity in 
the light of common knowledge. 

3 To understand interrelationships among ideas and 
to understand the organizational pattern well enough 
to predict what is likely to follow. 

4 To demonstrate understanding of connotative meanings 
of words; to infer meanings from the context; 
to understand how words are used to create a mood 
or an aesthetic feeling. 

III Evaluation and application 

1 To judge the validity and adequacy of the main idea; 
to distinguish fact from fancy; to distinguish probable 
fact from opinion and judgment. 
STEP Listening Comprehension, concluded 

Listening Comprehension 

2 To judge the extent to which the supporting details 
accomplish their purpose; to distinguish among relevant 
and irrelevant details; to judge whether or 
not more information is needed to prove the 
speaker's point. 

3 To evaluate the organization and development of 
what is said; to be aware of self-contradictions; to 
recognize the devices the speaker uses to influence 
the listener's thinking. 

4 To judge whether or not the speaker created an 
intended mood or effect -- and if the speaker has 
failed, to understand why. 

5 To recognize what the speaker wants the listener to 
do and to recognize ways in which the speaker's 
ideas may be applied properly in new situations. 



Administration of the tests 

The development of standardized tests in listening presents 
important problems of test administration. The alternative 
plans evaluated by the committee were to use tape recordings 
or to prescribe that the selections be read aloud by 
classroom teachers (or the test administrator). Each 
method has advantages and disadvantages; the evidence is 
not conclusive. The committee concluded that the evidence 
favored oral presentation by classroom teachers. This type 
of presentation makes the tests less expensive and does not 
require equipment for playing recordings. 

Note: The term "auding" is gaining wide currency to denote 
the [illegible] identified here as listening comprehension. 
Sources of information about this relatively new field may be 
found in the Review of Educational Research, April 1955. 



Sample of listening test material 

Because listening comprehension is based on fairly long 
dictated passages, there is space here for a sample from 
only one level. The examiner reads the passage once and 
then reads aloud both the questions and the possible 
answers. The student has before him a booklet that gives 
only the possible answers -- and an answer sheet. 

Level 3 (Grades 7-9) Reading time: 1 min., 30 sec. 

The examiner reads: 

Here is the fourth selection. It is a speech by a student 
running for school office. 

A students, B students, C students, D students, and my 
friends! As you know, I am running for the office of President 
of the Student Council. I'd like to tell you what I'll do 
if I'm elected. In the first place, I think several students 
ought to sit in on teachers' meetings. They settle too many 
things for us. I don't think that the teachers always know 
what's best for us. 

In the second place, I'd like to see our Student Council 
do something. Take the business of the candy machine, for 
instance. Just because a couple of doctors and dentists don't 
like it doesn't mean we shouldn't have one. I think they are 
wrong. I think we should have one. Candy is good for us. 
It gives us energy, and I, for one, don't think it hurts either 
your teeth or your appetite. And if it does, so what? You 
save the lunch money and can go out on a date. 

Last, you know that my opponents -- and you'll hear from 
them in a minute -- are two girls. Now, everybody says girls 
are smarter than boys. That might be true -- but just because 
they're smarter doesn't mean they'll make better 
officers. In fact, I think girls are too smart and can't always 
get along with people because of that. Maybe we need 
somebody not so smart, but that can get along. That's me, 
fellow students -- vote for me! 



19 The speaker's principal objection to girls as school 
officers evidently is that they 
A talk too much 
B support the teacher's point-of-view 
C are too smart to get along with people 
D don't want a candy machine 

20 It is likely that in the past the speaker has 
E disagreed with the teachers' decisions 
F disagreed with the opinions he has stated 
G agreed with the doctor about the candy machine 
H agreed with his other opponents about decisions of 
teachers 

21 By saying "A students, B students, C students, D students, 
and my friends," the speaker is trying clearly to get 
all the students 
A at the top of the class to vote for him 
B at the bottom of the class to vote for him 
C in the school to vote for him 
D who agree with him to vote for him 

22 When the speaker used the word "opponents" he 
meant 
E students from other schools 
F students running against him 
G the teachers 
H doctors and dentists 

23 Judging from his comments, how does the speaker feel 
about the opinions of experts? 
A He pretends that the experts agree with him. 
B He does not respect the experts if he disagrees with them. 
C He pretends to treat the experts with respect. 
D He follows expert advice unless he can prove that it is 
wrong. 
STEP Reading Comprehension (for native English speakers) 



The Reading Comprehension Tests 



The members of the committee on reading comprehension 
are: 

Chairman: Constance M. McCullough, San Francisco 
State College 

Level 1 (Grades 13-14) 
*Robert M. Bear, Dartmouth College 
Philip Shaw, Brooklyn College 
Macklin Thomas, Chicago City Junior College 

Level 2 (Grades 10-12) 
*Luella B. Cook, Minneapolis Public Schools 
Dorothy [illegible] McCullough, Tudor Hall School, Indianapolis 
M. Marion [illegible], Oakwood High School, Dayton, Ohio 

Level 3 (Grades 7-9) 
*Lauren L. Brink, University of Nevada 
Helen F. Olson, Queen Anne High School, Seattle 
Jerry [illegible], Kepner Junior High School, Denver 

Level 4 (Grades 4-6) 
Harvey Alpert, University of Florida 
Robert O. Simpson, San Francisco Public Schools 
*George D. Spache, University of Florida 

*Members of planning committee, Constance M. McCullough, 
Chairman 



Purposes of reading 

Of all the magic that education produces, there is little 
comparable to what happens when a child learns to read. 
And when the school helps the child to read with increasing 
skill and insight, it extends the magic by giving him a basic 
tool for understanding himself and his world. Whatever 
differences there may be regarding the goals of education 
-- and there are many -- the development of reading skills 
for all children remains the first of the three R's. 

Possibly this is why more progress has been made in the 
measurement of reading comprehension than in other areas 
of educational testing. A number of educators and psychologists 
have made lists of abilities which are thought to add 
up to the ability to read well. The committee on reading 
had the advantage of acquaintance with the long history of 
analytical work in reading comprehension and knowledge 
of the types of testing materials that have been used 
successfully in the past. 

The purpose of the STEP reading tests is to evaluate 
student ability to read new materials with comprehension, 
insight, and critical understanding. The task of the committee 
was not so much to explore new ground (as in the 
writing and listening tests) as it was to develop a plan 
which would take advantage of those current and past 
developments which are most closely related to the philosophy 
of the STEP program. 

Criteria for selection of materials 
Reading passages should be: 

* interesting to the pupils tested, and neither obviously 
dated nor offensive in any way. 

* of a kind similar to those read by pupils in their 
ordinary school and life situations, but not likely to be 
familiar to students taking the tests. 

* crucial in value 

* distributed in difficulty across several grades for tests at 
each level 

* more or less self-contained and representative of a 
variety of types of reading, a variety of fields and content, 
and a variety of media of written communication. 



Types of materials 

To test breadth of student development in reading skills, 
the selections represent a wide range of content, but the 
tests do not emphasize understanding of concepts or developed 
ability in any of the subject areas. Moreover, selections 
contain the information needed to answer the questions. 
However, the tests do measure the ease with which 
students read in the various content areas. (This difference 
in purpose can be seen by comparing reading test selections 
with items used in the science and social studies tests.) 

What are the reading skills tested? 

So far as possible, questions on each passage are distributed 
among five general categories of skills identified. 

1 Ability to understand direct statements made 
by the author: to understand denotative meanings 
of words; to identify parallel statements; to recognize 
paraphrases; to recognize a correct statement of time 
sequence; to identify things mentioned most frequently. 

2 Ability to interpret and summarize the passage: 
to select a suitable title; to identify the type of passage 
(fiction, history, etc.); to draw inferences from statements 
made by the author; to understand connotative 
meanings of words. 

3 Ability to see the motives of the author: to be 
able to state the author's purpose; to understand why 
the author included or excluded certain things; to 
identify the tone of the passage. 

4 Ability to observe the organizational characteristics 
of the passage: to recognize where divisions 
might come in a single long paragraph; to state the 
main topics of separate paragraphs; to understand the 
basis on which a passage is organized. 

5 Ability to criticize the passage with respect to its 
ideas, purposes, or presentation: to judge if an 
argument is unsupported; to identify a valid objection 
not answered by the author; to judge effectiveness of 
devices used by the author (metaphor, example, etc.); 
to be aware of basic assumptions the author expects 
the reader to take for granted. 
Reading Comprehension 

Samples of reading test material 

Reading passages were first classified according to grade 
level of difficulty. A set of passages representing the seven 
types of reading material was then selected for each level. 
Test items (questions) were designed to cover as many of 
the reading skills as possible. These items were then reviewed, 
revised as necessary, and arranged into whole tests. 

The following samples from Level 4 (Grades 4-6) and 
Level 2 (Grades 10-12) illustrate some of the kinds of 
questions used. Both selections are classified as "opinion 
or interpretation" and require use of abilities in several 
categories. 

Level 4 (Grades 4-6) 
Dear Bill, 

It was fun to be on the farm. Yesterday morning, Jack 
and I watched Aunt Mary make butter. She did not need 
to use all her cream to make butter. She sent most of the 
cream to the creamery. 

I wish I were a farmer. I would take just a little cream 
for butter. Then I would use all the rest of the cream to 
make ice cream. Wouldn't that be fun? 

I'm sorry you could not go to Jack's farm with me. I 
had the time of my life. Every day, Jack kept finding some 
new thing to do. 

We rode Jack's horse. We worked around the barn. We 
fed the animals. We gave corn to the hogs in their pen. 
What a noise a hog can make. We gave hay to the horses 
and the sheep and the little lamb. 

I came back to town yesterday. I must say good-by for 
now. Write soon. 

Your cousin, 
Betty 

6 In this letter, Betty is trying to tell 
E how to make butter 
F what she did at the farm 
G what horses eat 
H how much noise a hog makes 

8 Which of these things that Betty said tells best how 
she feels about living on a farm? 
E We worked around the barn. 
F I came back to town yesterday. 
G I wish I were a farmer. 
H We rode Jack's horse. 

9 The letter is happy except where Betty is 
A saying Bill couldn't come 
B telling about riding the horse 
C having to say good-by 
D telling about the cream 

10 Where does Betty live? 
E In the mountains F On a farm 
G Near the ocean H In a town 



Level 2 (Grades 10-12) 

In turn-of-the-century vaudeville, folding beds were favorite 
comedy props, but the many descendants of those early 
folding beds are no laughing matter. Today's smaller homes 
call for furniture that conserves space by serving more 
than one purpose, and the modern "convertibles" are going 
far toward satisfying that need. They can turn the most 
proper living room into a dormitory that will sleep nine 
people. Convertible furniture is giving American homemakers 
the imaginative engineering, improved design, and 
remarkable mass-production prices associated with home 
appliances. This development has provided the biggest 
home-furnishing news in recent years. In 1940, United 
States families spent about 22 million dollars for convertible 
sleep furniture; now, they are spending six times that 
amount for beds that hide in the living room during the 
day. 

11 The information in this passage would be of interest to 
A housewives B furniture manufacturers 
C buyers for furniture stores D all of these 

12 It is evident from the article that 
E furniture designers are concentrating on the needs 
of small houses 
F today's smaller houses require smaller furniture 
G modern bedrooms will have to accommodate more 
than two persons 
H old-fashioned furniture can be converted to fulfill 
today's requirements 

13 Which of the following techniques does the author use 
to make his presentation of ideas effective? 
A Supporting a statement with specific proof 
B Giving figures 
C Listing advantages D All of these 

14 United States families buy convertible furniture today 
at an annual cost of 
E over 100 million dollars 
F 66 million dollars 
G 22 million dollars 
H 6 million dollars 

15 In the sentence beginning in line 6 ("They can turn 
. . .") the author is 
A adding an entirely new idea to his article 
B illustrating the meaning of the preceding sentence 
C generalizing from the preceding sentence 
D making a general statement which will be followed 
by an example in the next sentence 
Endnotes 

1. The SLEP test was originally administered only in scheduled 
administrations at international test centers established by 
ETS, a practice that was discontinued in the early 1980s. 
Since that time, however, SLEP has been continuously available 
to qualified institutions, agencies, or individuals for local 
administration and scoring, through a program that has come to 
be known as the SLEP School Services Program (ETS, 1991). 

2. Two answer-sheet formats are offered: a three-ply format, 
in which sheets 2 and 3 record direct images of the correct 
responses only, and a single-sheet format that requires the 
use of scoring stencils. 

3. Perspective regarding problems, issues, and practices in 
ESL proficiency assessment in secondary school settings was 
gained through discussions with individuals in the New Jersey 
Department of Education regarding the evaluation of 
federal/state-funded programs for students of Limited English 
Proficiency (LEP); also through a meeting with individuals 
responsible for ESL/bilingual programs in the Princeton (NJ) 
area, and the director of an ESL program for international 
students at a private secondary school in the same general 
area. 

4. Individuals concerned with ESL/Bilingual programs in three 
New Jersey secondary school settings completed and commented 
on the draft, as did three university-based ESL program staff 
members. The draft was reviewed by the TOEFL program 
representative in Canada. 

5. In the U.S. and Canada, ESL proficiency testing is mandated 
for use in evaluating certain ESL/bilingual programs sponsored 
by federal/state/provincial governments. This reason for test 
use was not anticipated for other countries, in which it was 
anticipated that the SLEP test might be used to assess gains 
in proficiency associated with academic "English as a foreign 
language" (EFL) instructional programs. 

6. The principal source of records regarding "users" of the 
SLEP test was the general systems file maintained for fiscal 
accounting purposes. In many instances, orders for sets of 
test booklets (Form 1, Form 2, and/or Form 3) and related 
materials are placed by or through a business or purchasing 
office or agent. Neither the program in which the SLEP test 
is ultimately used nor specific name/title/program 
identification for the responsible test user is available. 
Moreover, systems files, intended primarily to meet current 
operational demands, are not designed to provide a 
consolidated, historical record of transactions by purchasers. 

7. Traditionally, all orders for the SLEP test have been
processed by the ETS (Princeton) office. In recent years, the 
TOEFL representative office in Canada has processed orders 
originating in that country. 

8. The survey questionnaire did not ask for information 
regarding the type of answer sheet and scoring procedure 
employed. In relatively "high volume" contexts, scannable 
answer sheets and computerized scoring procedures undoubtedly 
were used (a practice that was specifically reported by only 
one respondent).

9. The SLEP School Services Program has made the test 
available "... for purchase and use ... by post secondary 
institutions, training agencies, educational consultants, and
others engaged in legitimate testing activities" (e.g., ETS, 
1988, p. 8). Use with ESL students whose age/grade placement 
is below the G7-12 range is indicated by informal feedback 
from the field. 

10. As will be seen in a subsequent section, these items also
were mentioned unfavorably by respondents in free responses 
identifying "positive" and "negative" features of SLEP, 
generally. 

11. See Appendix B for differences in the detailed
specification of purposes for testing (for "placement" and for
"assessing average (net) gain") in questionnaires for
U.S.A./Canada and other locations, respectively, that are not
directly pertinent here.

12. In subsequent correspondence with one respondent to the 
survey questionnaire, in a use-context in which placement 
decisions involved a composite of interview ratings, essay 
ratings and SLEP scores, it was learned that the variable of 
major interest for local "normative" purposes was the 
composite, not SLEP or other component elements — whose local 
distributions were well known. 

13. Several respondents enclosed documents describing local 
studies and/or study outcomes; one respondent enclosed a 
report describing a study of change in test performance 
associated with intensive ESL instruction. 

14. Strength of association was rarely characterized 
statistically; only two respondents reported a correlation 
coefficient to indicate strength of association between 
measures. In a number of instances, the "other variable(s)"
involved were not explicitly described. 

15. Illustrative responses in this category include the
following verbatim comments by respondents who indicated that
a study had been conducted: "Since SLEP is used for placement
purposes, teachers' comments after placement are very
important for any adjustments in placement" (from a U.S.
university respondent). "Compare SLEP scores with academic
grades and teacher judgment" (from a U.S. high school ESL
teacher).

16. This is likely to be the case in most similar testing 
situations (see, for example, Cummins, 1983; the SLEP Test 
Manual [e.g., ETS, 1987, pp. 34-35]). Also, compare the 
listening comprehension and reading comprehension items in a 
test for G4-G9 native English speakers (Appendix D), with the
corresponding SLEP items (Appendix A).

17. NCEs represent a transformation of percentile 
distributions to a standard scale (mean = 50, SD = 21.06) that
permits "equal interval" comparisons regardless of score 
level. This index is widely used in conjunction with mandated 
models for assessing average (net) gain in test performance 
for students in federally or state funded remedial programs in 
the United States. 
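
The transformation described in this note can be sketched as
follows. This is a minimal illustration, not a procedure taken
from the SLEP Test Manual: it assumes the conventional NCE
definition (the normal deviate corresponding to a percentile
rank, rescaled to a mean of 50 with a scaling constant of
21.06, chosen so that NCEs of 1, 50, and 99 coincide with the
same percentile ranks). The function name is hypothetical.

```python
from statistics import NormalDist

NCE_MEAN = 50.0
NCE_SD = 21.06  # conventional NCE scaling constant (assumed here)


def percentile_to_nce(percentile_rank: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    if not 0.0 < percentile_rank < 100.0:
        raise ValueError("percentile rank must lie strictly between 0 and 100")
    # Normal deviate (z) that cuts off the given proportion of a normal curve.
    z = NormalDist().inv_cdf(percentile_rank / 100.0)
    return NCE_MEAN + NCE_SD * z


if __name__ == "__main__":
    for pr in (1, 25, 50, 75, 99):
        print(pr, round(percentile_to_nce(pr), 1))
```

Because equal NCE differences correspond to equal distances on
the underlying normal scale, a gain from the 50th to the 75th
percentile yields a smaller NCE change than a gain from the
75th to the 99th, which is the sense in which the scale
permits "equal interval" comparisons.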

18. As put by one ESL/Bilingual supervisor: "Although SLEP is 
not a criterion referenced test, it would be helpful to know 
what underlying skills or curriculum goals, if any, are 
addressed by the test items." Expert classification of test 
items according to "skills/functions" appears to be feasible,
and would permit useful extension of the information provided 
by SLEP. 

19. None of the respondents indicated precisely how SLEP was
used to monitor the progress of individual students. However,
one respondent expressed keen dissatisfaction with SLEP
because some students had lower scores when posttested than
they earned when pretested (a phenomenon that reflects factors
subsumed under the rubric of "errors of measurement"), although
the group as a whole apparently registered an average (net)
gain. It would be useful to include in the SLEP Test Manual
a brief discussion of the problems associated with using a
simple test-retest model for evaluating the progress of
individual students (as opposed to the use of such a model for
assessing average change).
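
The point about individual versus average change can be
illustrated with a small calculation. The sketch below assumes
a classical test-theory model with independent, normally
distributed errors of equal size on both testing occasions; the
score standard deviation, reliability, and true gain used in
the example are hypothetical values chosen for illustration,
not SLEP statistics.

```python
from math import sqrt
from statistics import NormalDist


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical SEM: SD * sqrt(1 - reliability)."""
    return score_sd * sqrt(1.0 - reliability)


def probability_of_observed_loss(true_gain: float, score_sd: float,
                                 reliability: float) -> float:
    """Chance that posttest < pretest for a student whose true gain is given.

    The observed difference is treated as normal with mean equal to the
    true gain and standard error SEM * sqrt(2) (two independent errors).
    """
    sem = standard_error_of_measurement(score_sd, reliability)
    se_difference = sem * sqrt(2.0)
    return NormalDist(mu=true_gain, sigma=se_difference).cdf(0.0)


if __name__ == "__main__":
    # Hypothetical values: SD = 10 scaled-score points, reliability = .90,
    # true gain = 3 points.
    print(round(probability_of_observed_loss(3.0, 10.0, 0.90), 2))
```

Under these assumed values, roughly one student in four with a
genuine 3-point gain would nonetheless show a lower posttest
score, while the group mean would still rise by about 3 points.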

20. Examples of available evidence bearing on SLEP's validity
for use with college-level samples include the following: 

(a) One survey respondent reported correlations averaging 
.63 between SLEP total score and professionally rendered 
ratings of oral English proficiency (based on formal 
interviews) and ratings of writing samples, respectively, for
Japanese students (N = 1,648) planning to enter a
college-level English-medium program in Japan; other
college-level SLEP users reported favorably on SLEP's
usefulness (validity) for ESL assessment purposes.

(b) SLEP scores have been found to be relatively closely
related to TOEFL scores in one sample of college-level
students: correlations centered around .80 in a sample of
students in a college-based intensive ESL program, with a mean
of 519 on TOEFL and 55 (80th percentile) on SLEP (e.g., ETS,
1988).

(c) A study (Butler, 1989) of SLEP performance of ESL 
students in member institutions of the Los Angeles Community 
College District (LACCD) indicated that SLEP performance of 
independently established proficiency-level groups varied 
systematically with placement level, and that the test items 
were at an appropriate level of difficulty for the students 
involved.

(d) A study (Rudmann, 1991) involving ESL students at
Irvine Valley (CA) Community College found that SLEP scores
were related positively (average levels approximately .40) to
grades earned in English courses; this despite the fact that
the students were assigned to the respective courses on the
basis of SLEP scores, with attendant restriction of range on
the test within the respective course-level samples.

Findings such as the foregoing constitute what appears to
be "conceptually persuasive" evidence that SLEP can be
expected to provide reliable and valid discrimination in
samples of college-level ESL students. Further evidence is
needed; validity assessment can never be considered
"complete."
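
The effect of restriction of range mentioned in item (d) above
can be made concrete with the standard correction for direct
restriction on the predictor (often called Thorndike's Case 2).
The sketch below is illustrative only: the correction was not
reported in the study cited, and the ratio of unrestricted to
restricted standard deviations used in the example is a
hypothetical value.

```python
from math import sqrt


def correct_for_range_restriction(r_restricted: float,
                                  sd_unrestricted: float,
                                  sd_restricted: float) -> float:
    """Estimate the correlation in the unrestricted group.

    Applies r_c = r*k / sqrt(1 - r^2 + r^2 * k^2), where k is the ratio
    of the predictor's unrestricted SD to its restricted SD.
    """
    k = sd_unrestricted / sd_restricted
    r_sq = r_restricted ** 2
    return (r_restricted * k) / sqrt(1.0 - r_sq + r_sq * k ** 2)


if __name__ == "__main__":
    # Observed within-course correlation of about .40; the SD ratio of 2.0
    # is assumed purely for illustration.
    print(round(correct_for_range_restriction(0.40, 2.0, 1.0), 2))
```

With no restriction (equal standard deviations) the formula
returns the observed coefficient unchanged; the greater the
restriction within course-level samples, the more the observed
coefficient understates the relationship in the full group.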

21. ESL proficiency assessment, mandated in connection with 
governmentally funded programs for students with limited 
English proficiency, typically must be conducted using only 
"approved" tests and procedures. For example, New Jersey
administrative codes specify that "an English language
proficiency test, in the areas of listening, speaking,
reading, and writing, must be administered to those pupils
with another language in their background .... The Language
Assessment Battery (LAB, 1982) and the Maculaitis Assessment 
Program (MAC, 1982) are the tests used for this purpose. 
However, other language proficiency tests may be used, as long 
as the tests have validity and reliability, measure the areas 
of listening, speaking, reading, and writing, and have been 
aligned to the state norms established for the LAB and MAC 
tests" (New Jersey State Department of Education, 1990: p. 
10). Remedial work may be focused primarily on particular
skills (e.g., reading comprehension). "Norm-referenced
models" have been developed for use in evaluating programs.
ESL proficiency testing (along with assessment of age/grade
appropriate subject-matter attainment in English and/or L1) is
conducted locally, using tests selected by local districts
from lists of state-approved tests. Currently approved tests
may lose "approved" status if their norms are more than ten
years old (see, for example, New Jersey State Department of
Education, 1990, p. 130).

22. Respondents in a number of SLEP use contexts indicated 
that they would like more information regarding the strength 
and consistency of relationships between SLEP scores and 
scores on TOEFL. The TOEFL appears to be the "ultimate"
challenge for ESL students in many SLEP use contexts--indeed 
one measurable goal of instruction mentioned by several 
respondents was the attainment of a particular TOEFL score. 
For the informed guidance of practitioners, information is 
needed regarding the typical level and range of TOEFL 
performance that can be expected (concurrently or after some 
designated period of instruction) for examinees with 
particular scores on SLEP. Accordingly, it would be useful 
to collect data needed to extend evidence regarding SLEP/TOEFL 
relationships in samples from both secondary-level and 
college-level SLEP use contexts. In any event, in reporting
on relationships to test users--e.g., indicating expected
TOEFL scores for given SLEP score ranges, as in the SLEP Test
Manual (e.g., ETS, 1991, Table 16, p. 27)--an "expectancy
table" format, rather than a simple "table of equivalents,"
should be used. One survey respondent expressed considerable
dismay upon discovering that the actual TOEFL scores earned by 
her students were frequently considerably at variance with the 
"equivalents" indicated in the SLEP Test Manual. Seeing the 
scatter of TOEFL scores for examinees in designated SLEP score 
ranges should help users to form realistic expectations. 
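
The "expectancy table" format recommended in this note can be
sketched as follows: for each SLEP score band, the table shows
the percentage of examinees falling in each TOEFL score band,
rather than a single "equivalent" score. The band widths,
function names, and paired scores below are hypothetical and
serve only to illustrate the layout; they are not data from
the survey or from the SLEP Test Manual.

```python
from collections import Counter, defaultdict


def band_label(score: int, band_width: int) -> str:
    """Label the score band containing `score`, e.g. 52 -> '50-54'."""
    lower = (score // band_width) * band_width
    return f"{lower}-{lower + band_width - 1}"


def expectancy_table(pairs, slep_band_width=5, toefl_band_width=50):
    """Return {SLEP band: {TOEFL band: percent of examinees}}."""
    counts = defaultdict(Counter)
    for slep, toefl in pairs:
        slep_band = band_label(slep, slep_band_width)
        counts[slep_band][band_label(toefl, toefl_band_width)] += 1
    table = {}
    for slep_band, toefl_counts in counts.items():
        total = sum(toefl_counts.values())
        table[slep_band] = {
            toefl_band: round(100.0 * n / total, 1)
            for toefl_band, n in sorted(toefl_counts.items())
        }
    return table


# Hypothetical (SLEP, TOEFL) score pairs, for illustration only.
SAMPLE_PAIRS = [(52, 480), (53, 510), (54, 530), (51, 455),
                (57, 520), (58, 560), (56, 505), (59, 540)]

if __name__ == "__main__":
    for slep_band, row in sorted(expectancy_table(SAMPLE_PAIRS).items()):
        print(slep_band, row)
```

Each row of such a table shows the scatter of TOEFL outcomes
for a given SLEP range, so a user can see at a glance how often
actual scores depart from the most likely band.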

23. ESL professionals in college-level settings, interviewed 
a decade ago (Hale and Hinofotis, 1981), also reportedly
stressed "... the need to assess productive as well as 
receptive skills" (p. 9) for placement purposes. 

24. The results of criterion-related validity studies 
involving "common criteria" can be expected to provide useful
general guidelines for test interpretation, based on the
results of studies that have been designed explicitly to link
level of performance on indirect, norm-referenced measures to
quasi-absolute proficiency scales, using ratings of classroom 
ESL teachers (e.g., from the TOEFL testing context, see Boldt, 
Larsen-Freeman, Camp, & Levin, in press; and from the TOEIC 
testing context, see Wilson, 1991). 

25. The reference here is not to the development of a test
with multiple scores to be employed in identifying strengths
and weaknesses of individuals, but to the use of average scores
on items designed to measure particular skills to identify 
skill-areas requiring more/less emphasis in instruction, or to 
compare groups with respect to profiles of skills--that is, 
information that can be useful for evaluating or planning 
instruction. As noted by Hale and Hinofotis (1981, p. 20):
"It is possible to employ a basically integrative approach 
with tests focusing on the assessment of the major skills in 
an appropriate context and, at the same time, to provide a 
breakdown by subskills or objectives within those major 
skills."
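
The group-level use of item data described in this note might
look like the following sketch, which averages proportion
correct over the items assigned to each skill area for a group
of examinees. The skill labels, the item-to-skill assignment,
and the response data are all hypothetical; no such
classification of SLEP items is reported here.

```python
from collections import defaultdict


def mean_proportion_correct_by_skill(responses, item_skills):
    """Average proportion correct per skill area across a group.

    responses:   one list of 0/1 item scores per examinee
    item_skills: skill label for each item, in item order
    """
    correct = defaultdict(int)
    attempts = defaultdict(int)
    for examinee in responses:
        if len(examinee) != len(item_skills):
            raise ValueError("each examinee needs one score per item")
        for score, skill in zip(examinee, item_skills):
            correct[skill] += score
            attempts[skill] += 1
    return {skill: correct[skill] / attempts[skill] for skill in attempts}


if __name__ == "__main__":
    # Hypothetical skill labels and scored responses for three examinees.
    skills = ["vocabulary", "vocabulary", "inference", "inference"]
    group = [[1, 1, 0, 1], [1, 0, 0, 0], [1, 1, 1, 0]]
    for skill, p in mean_proportion_correct_by_skill(group, skills).items():
        print(skill, round(p, 2))
```

Comparing such profiles across classes or programs indicates
skill areas that may need more or less emphasis in
instruction, without reporting subscores for individuals.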

26. This approach was employed by Clark and Swinton (1979) in 
their study concerned with the development of the Test of 
Spoken English (TSE). The final selection of TSE items was
based in part on patterns of correlation with ratings of oral 
English proficiency. 

27. See Hale and Hinofotis (1981, pp. 20-22) for illustrative
analytic approaches to the problem of providing "... a
breakdown by subskills or objectives within ... major skill
areas (tapped by an integrative test)."